Tcl is faster than Python for database benchmarking (hammerdb.com)
55 points by otterley on Nov 1, 2022 | hide | past | favorite | 44 comments


Let’s make a list of languages that are slower than Python. That will take less time.


I can only think of MATLAB


Is this true though? Considering its usage, I would expect MATLAB to be on par with Python utilizing numpy/numba/etc.


It takes a whole 5 minutes on my computer just to start up the damn thing.


> At the time Tcl was the only multi-threaded GIL free scripting language for database access and this still holds true today.

Raku (nee Perl 6) is GIL-free IIRC. So are Julia and Erlang/Elixir, for sufficiently-loose definitions of "scripting language". The list probably goes on.


Raku is also, unfortunately, the slowest language I've ever used especially, paradoxically, for regex and string handling given its Perl lineage.


The tcl threads are shared nothing with message passing...each thread is a separate interpreter. So it's not obviously better or different than any other language running multiple processes with message passing via mmap or similar.
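
For anyone who hasn't seen the style, here is a rough Python analogy of shared-nothing workers with message passing. It is only a sketch of the communication pattern: `worker` and `run` are made-up names, and unlike Tcl's one-interpreter-per-thread model these threads still share one interpreter and the GIL.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Keep private state and talk to the outside only through messages."""
    local_total = 0  # never touched by any other thread
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel: no more work
            break
        local_total += msg
    outbox.put(local_total)


def run(values, n_workers: int = 2) -> int:
    outbox = queue.Queue()
    inboxes = [queue.Queue() for _ in range(n_workers)]
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for inbox in inboxes
    ]
    for t in threads:
        t.start()
    # Hand out the work round-robin, then tell every worker to stop.
    for i, value in enumerate(values):
        inboxes[i % n_workers].put(value)
    for inbox in inboxes:
        inbox.put(None)
    for t in threads:
        t.join()
    # Each worker reports exactly one partial result.
    return sum(outbox.get() for _ in range(n_workers))
```

The only things that cross a thread boundary are the queued messages, which is the part that resembles the Tcl model.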


mmap has a serialization cost, and sharing complex data structures is hard with it.

It's also different than multiprocesses as it will likely take less RAM.

Python is exploring multiple interpreters for these reasons. Python 3.8 SharedMemory works well, but it's far from transparent.
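
To show what "far from transparent" means in practice, here is a minimal sketch using `multiprocessing.shared_memory`. The record layout and the helper names (`RECORD`, `create_block`, `write_record`, `read_record`) are assumptions for illustration. The point is that you get a raw byte buffer and have to pack and unpack values yourself.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical fixed layout: one int64 counter followed by one float64 total.
RECORD = struct.Struct("<qd")


def create_block() -> shared_memory.SharedMemory:
    """Create a named block big enough for one record."""
    return shared_memory.SharedMemory(create=True, size=RECORD.size)


def write_record(name: str, count: int, total: float) -> None:
    """Attach to an existing block by name and pack the values into it."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        RECORD.pack_into(shm.buf, 0, count, total)
    finally:
        shm.close()


def read_record(name: str) -> tuple:
    """Attach to an existing block by name and unpack the values from it."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return RECORD.unpack_from(shm.buf, 0)
    finally:
        shm.close()
```

Another process would only need the block's name to attach. The creator is still responsible for calling `close()` and `unlink()` when done, and nothing here synchronizes concurrent writers.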


>mmap has a serialization cost

It does, though there should be optimized ways to do that if it's message passing and not object passing.


Lua doesn't have a GIL, either.


Indeed, Raku does not have a GIL


Perl is GIL-free too, actually.


PHP is GIL-free too.


> X is faster than python for Y

I feel that this statement is true for most values of X and Y. I've yet to find a mainstream language that's slower than python, or a task at which python is truly the most performant.


You can use the quantum supremacy argument and claim that python is the fastest at being python.


Except even that's not true, if you consider that, canonically, "python" means CPython -- and PyPy is faster!


Raku is slower.


Fun fact: because of tkinter, Python embeds a TCL interpreter:

    >>> import tkinter as tk
    >>> tk.Tcl().eval('puts {hello, world}')
    hello, world
    ''
So you could technically use TCL to query your DB if you cared.

But honestly, DB queries in real life are spread across threads, then multiple processes, then a connection pool like PgBouncer, so you are unlikely to see the difference most of the time. Not to mention that the DB choice, the schema, the query quality, the DB configuration, the hardware and how complicated your requests are will all be more likely to affect request speed.

Still, interesting. Exploiting all those CPUs in Python would cost a lot of RAM given how many processes you would have to spawn.

I would love to see the comparison using asyncio on uvloop to check whether the CPUs are really the bottleneck though.


Understandable, SQLite was always bound to TCL.

Also, TCL/Tk was the RAD/Glue language for the 90's Unices. Kinda like Visual Basic. C backend, TCL glue, TK grid UI, DB connection.


I wonder when python crossed above tcl. GTrends only go back to 2004


I'd say right in the early 2000s and then Django starts becoming popular, and the rest is history. But the writing is really on the wall in the late 1990s due to the intuitiveness of the Python language and beginner friendly design compared to the other popular scripting languages at the time including Perl, TCL, PHP, Ruby, etc. Speed, however, is never Python's advantage and for scripting languages it's not really a factor.

I think, based on Python's history, it's safe to say that in order to become hugely popular, a programming language needs to be beginner friendly and/or needs a killer application as a motivation for programmers to adopt it. Speed and performance are not important factors unless you are writing an OS, and we already have C for that.


Before Django there was Zope.

Zope and a saner alternative to Perl were the two main reasons to care about Python in 2000.


In the late 90s I remember CGI studios being invested in it as a glue language. Houdini and others also embedded it in their apps later.


Oracle used to use TCL for various purposes with the "intelligent agent".


TCL is such a neat language. I haven’t written that many blog posts about it but it’s about as expressive as Perl. A true GTD language.


A nice little thing that TCL has "given" Python, so to speak, is tkinter. Absolutely basic, but it still did the job for me back in the day.

I remember that when I started learning Python about 20 years ago I used to write a Python script with a quick and dirty tkinter interface on my Linux desktop at home, I was then copying everything on a floppy-disk in order to use that script/program at work, on a Windows machine (with the help of py2exe, another very nice tool).


Can you link your blog for us?

Thank you


So is assembly but I’m going to keep using mainstream languages


Breaking: Area man discovers that python threaded performance sucks. Professionals urge him to use multiprocessing instead.

Snarky, yes, but nothing that wasn't known before this article.


Haha yeah ok you're right, but I like how he wrote it up in a nice and methodical manner, and the post sort of gives a sense that he enjoyed doing it and writing it up, which transfers. It's a peculiar thing that benchmarking has, I think, it becomes enjoyable in itself.


... for some definition of methodical that ignores multiprocessing.


Meh, mp is kind of a disgusting hack. Maybe it would have worked here but it's so easy to run into bullshit. Off the top of my head, if you want to share a TLS connection you're obviously fucked, if you use gRPC you need to specify an env flag to be fork safe, memory overhead that interacts poorly with reference counted GC, etc.

Again, in this example I suspect things would have worked, because it's trivial code, but I wouldn't actually bet on it because I've seen MP fuck up so many times historically. Like, maybe one of those modules actually creates a static connection pool that gets messed up on mp? Who knows?

So yeah, maybe the obscene hack that 'mp' is would have worked or just use tcl, which is sane and worked for them. A note about mp in the post would have made sense, I just won't blame someone for not wanting to deal with it.


> if you want to share a TLS connection you're obviously fucked...

Some would argue that if you're using threads, well then you're fucked from the start, very possibly including me.

Go gets around this using goroutines, but Go isn't necessarily thread safe per se. Further, Go also has its own bag of bugs you have to be aware of when using it.

I think maybe the true answer here is rust.


Particularly since tcl threads don't share anything. It's one thread per interpreter.


I wonder why are these called "threads" then.

The very point of threads (as opposed to parallel processes) is the easy access to common memory and other resources.


Erlang is fairly famous for working in a similar way.


In fairness, that is in the pipe for python for a release soon. One interpreter per thread.


Not quite, subinterpreters can contain multiple threads each. But you're right in that if you limit yourself to one thread per subinterpreter, then you should be able to get good multicore scaling that way, when more fully fleshed out subinterpreters arrive in 3.12. The trick then becomes how to efficiently communicate between them. Interesting times ahead.


You mean multiprocessing where communication goes either through serialization or through shared-memory where you have to use C-style programming and you lose all the advantages of e.g. a garbage collector and Pythonic datatypes?
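
The serialization half of that can be sketched in a few lines. `multiprocessing` queues and pipes pickle objects on send, so the receiver works on a copy. `send_copy` below is a hypothetical helper that just does the round trip in one process to show the effect.

```python
import pickle


def send_copy(obj):
    """Mimic what crossing a process boundary does: serialize, then rebuild."""
    wire = pickle.dumps(obj)   # bytes that would travel over the pipe
    return pickle.loads(wire)  # a brand-new object on the receiving side
```

Mutating the received object never changes the sender's copy, so every update has to be sent back explicitly, and each hop pays the pickling cost again.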


Uh... did you like actually read his code? He has no shared state across threads. It's a perfect example for multiprocessing.
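
Roughly what that could look like, as a sketch and not the article's code: `virtual_user` is a made-up CPU-bound stand-in for one benchmark user, and `run_users` fans the users out over worker processes with `concurrent.futures`.

```python
from concurrent.futures import ProcessPoolExecutor


def virtual_user(user_id: int, iterations: int) -> int:
    """CPU-bound stand-in for one benchmark user; touches no shared state."""
    total = 0
    for i in range(iterations):
        total += (i * user_id) % 7
    return total


def run_users(n_users: int, iterations: int) -> list:
    # Tasks are distributed over worker processes, each with its own
    # interpreter and its own GIL, so they can run on separate cores.
    with ProcessPoolExecutor() as pool:
        return list(
            pool.map(
                virtual_user,
                range(1, n_users + 1),
                [iterations] * n_users,
            )
        )
```

Call `run_users(...)` from under an `if __name__ == "__main__":` guard, since some platforms start workers by re-importing the main module. Only the arguments and the integer results cross the process boundary.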


Yeah the article felt intellectually honest and I enjoyed it, but was hoping for at least a footnote that multithreading is 1 solution. Async/await is another possible solution. There appears to be no shared state. But in fairness, python's state of multiprocessing, multithreading, and async/await just leaves too many options for a general language programmer to keep track of and perhaps it's just lack of knowing these things exist.


The article does indeed mention:

> (And although the Python test script could run in multiple processes HammerDB is an application simulating multiple virtual users in a GUI and CLI environment requiring the close interaction and control of using multiple threads).


Eh yes overlooked that but doesn't touch on async methods. I think there is an async postgres connector out there.
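
A minimal sketch of the async shape, with no real database involved: `fake_query` is a hypothetical placeholder where `asyncio.sleep` stands in for waiting on a round trip, and a real async driver would replace it.

```python
import asyncio


async def fake_query(user_id: int, delay: float = 0.01) -> int:
    """Pretend to wait on the database, then return a made-up result."""
    await asyncio.sleep(delay)  # stands in for awaiting a network round trip
    return user_id * 2


async def run_users(n_users: int) -> list:
    # All "users" wait concurrently on a single thread; gather returns
    # the results in the order the coroutines were passed in.
    return await asyncio.gather(*(fake_query(u) for u in range(n_users)))
```

Running `asyncio.run(run_users(3))` drives all three users on one thread. This only helps while the time goes into waiting on I/O. CPU-bound work in the client would still serialize.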


This is the critical point that most commenters are missing or ignoring!



