That section is short and I encourage you to read it in full. Here's what it says about how removing the GIL affects existing Python code:
• Destructors and weak reference callbacks for code objects and top-level function objects are delayed until the next cyclic garbage collection due to the use of deferred reference counting.
• Destructors for some objects accessed by multiple threads may be delayed slightly due to biased reference counting. This is rare: most objects, even those accessed by multiple threads, are destroyed immediately as soon as their reference counts are zero. Two places in the Python standard library tests required gc.collect() calls to continue to pass.
That's it with respect to Python code.
Removing the GIL will require a new ABI, so existing C-API extensions will minimally need to be rebuilt and may also require other changes. Updating C-API extensions is where the majority of work will be if the PEP is accepted.
This will be an opt-in feature by building the interpreter using `--disable-gil`. The PEP is currently targeted at Python 3.13.
> existing C-API extensions will minimally need to be rebuilt and may also require other changes.
This vastly understates the work involved. Most C extensions and embeddings will require major structural changes or even rewrites. These things are everywhere and are a major reason for python's popularity. A typical financial institution, for example, will have a whole bunch of them, with a morass of python code built on top. Many of these companies took ages to transition to python 3 (and many still have pockets that haven't!). For them to remove GIL reliance from their C extensions is a much bigger ask than migrating to 3 was, and their response to being asked to do it is likely to be blunt.
The fact that only a small minority of developers involved in python directly use the C API (and fewer understand the consequences of GIL removal for them) means the issue tends to be overlooked in discussions like this, but it's the reason a GILectomy will be worse than 2 to 3. In practice it's likely to end up being either a fork or a mode you have to switch on/off, both of which would be miserable for everyone.
The GIL is part of python's success story. It made it easy to write extensions, and the extension ecosystem made the language popular. Every language doesn't have to converge to the same endpoint. Different tools are suited to different jobs. Let it be.
First, if the extension isn't being used in a multithreaded environment, nothing should change. Yes, it isn't thread safe, but it doesn't really matter in that context. And given how badly GIL Python works with threads, I doubt the majority of extensions are written for multithreaded applications.
And the ones that are written for multithreading are probably already releasing the GIL for long-running computations, so they should be written with at least a little bit of thread safety in mind. And in the worst case it shouldn't be too difficult to hack the code to include a GIL-like lock that needs to be held by library users in order to ensure thread safety, without really changing the architecture of the extension.
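Something like the following is what that "GIL-like lock" could look like; a rough sketch with made-up names (`serialized`, `legacy_ext`), not a claim about how any particular extension does it:

    import functools
    import threading

    # One process-wide lock in front of the extension, so only one thread is
    # ever inside it at a time: roughly the exclusivity the GIL used to give.
    _ext_lock = threading.Lock()

    def serialized(func):
        """Wrap an extension callable so calls to it are serialized."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with _ext_lock:
                return func(*args, **kwargs)
        return wrapper

    # Usage (hypothetical): compute = serialized(legacy_ext.compute)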
There is no way of knowing whether a Python module is used in multithreaded environments or not. The point is that C extensions that previously worked fine in multithreaded environments probably will not work fine in multithreaded environments with the GIL disabled.
I'm sure you know this, but as a general psa: threads are the lowest common denominator way of doing this kind of thing. Where possible, it is nicer to use epoll/kqueue/whatever the windows equivalent is. For a library though I totally get how annoying it would be to interface with the platform event system on each os. Indeed, if there isn't already a python library that can handle subprocesses for you on a single thread then it might be something worth building (I don't use python personally so I don't know the ecosystem).
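For what it's worth, the standard library already gets you most of the way there. A rough, Unix-flavoured sketch (the child command is made up for illustration) using selectors, which wraps epoll/kqueue/etc. behind one API:

    import os
    import selectors
    import subprocess
    import sys

    # Launch a few children and watch all their stdout pipes from one thread.
    children = [subprocess.Popen([sys.executable, "-c", "print('hi', flush=True)"],
                                 stdout=subprocess.PIPE) for _ in range(3)]

    sel = selectors.DefaultSelector()
    for child in children:
        sel.register(child.stdout, selectors.EVENT_READ, data=child)

    remaining = set(children)
    while remaining:
        for key, _ in sel.select():
            chunk = os.read(key.fd, 4096)
            if chunk:
                print("got:", chunk.decode().rstrip())
            else:                          # EOF: the child closed its stdout
                sel.unregister(key.fileobj)
                remaining.discard(key.data)

    for child in children:
        child.wait()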
> Most C extensions and embeddings will require major structural changes or even rewrites.
A lot of those important "extensions" are actually C and C++ libraries designed for parallelism in their native use which have been made available in Python via bindings (e.g. Pytorch). The cores of these libraries are fine, only the binding layers (often automatically/semi-automatically generated) may need to be updated. I suspect only the C-API extensions designed from scratch to only be extensions are going to have major problems with a no-GIL world.
> Every language doesn't have to converge to the same endpoint. Different tools are suited to different jobs.
People have decided they want to use Python frontends. Plenty of alternative languages could have won out, which would have provided native threading, faster runtimes etc, but they didn't; we are stuck with Python. The continued existence of the GIL is incredibly restrictive on how you can leverage parallelism both in Python itself and in extensions. Only in the simple cases can you make the Python<->extension boundary clean: the second you want to be able to call Python from extension code (e.g. via callbacks or inheritance) the GIL stands in your way.
For every popular, well-engineered extension, I suspect there are a dozen hacked-together ones which will break the moment the GIL guarantees disappear.
I'm sure there are a lot of hacky extensions out there (particularly hiding in proprietary codebases), but letting the entire future of Python be held hostage to the poor SWE choices of third parties who almost certainly contribute nothing back is not a sustainable path.
But it is a programming language. You do not break backwards compatibility lightly. Nobody has ever chosen Python for its runtime performance. Unfortunately, sometimes you have to accept the technical debt cannot be escaped.
> Nobody has ever chosen Python for its runtime performance.
No, they choose it for the ease of using its performant extensions. And those extensions are fundamentally limited in performance by the existence of the GIL. And the authors of those extensions (and their employers) are behind the work to get rid of it.
There are three groups here:
1. "Pure" Python users, whose code makes little/no use of extensions. GIL removal is an immediate win for these uses.
2. "Good" extension users/authors, whose code will support and benefit from no-GIL. GIL removal is an immediate win for these uses.
3. "Bad" extension users/authors, whose code doesn't/can't support no-GIL. GIL removal probably doesn't break existing code, but makes new uses potentially unsafe.
Maintaining the technical debt of the GIL indefinitely so that group 3 never needs to address its own technical debt is not a good tradeoff for groups 1 and 2 which actively want to move the language forwards.
The above argument could apply to any breaking change a platform wants to make: there are always those who don't directly consume the change, those who can adapt, and those who are fine with the status quo.
Python is free to do as it pleases, but this breaking change is going to result in a lot of churn.
> GIL removal is an immediate win for these uses.
As a minor aside, the GIL-free Python version actually comes with a performance regression, somewhere in the 5-10% range: https://peps.python.org/pep-0703/#performance . So, an immediate loss for nearly everyone.
No, this is false. Relying on the GIL isn't bad practice in a python extension. It's something you have no choice but to do. The GIL is the thread safety guarantee given to the extension by the python interpreter. It's the contract you have and you have no choice but to code to it. Removal of the GIL requires an alternative contract representing a finer-grained set of thread-safety guarantees for people to code against. Programmers who coded against the contract that existed rather than somehow divining the future and coding against a different contract that hadn't been invented yet (while also managing to make it compatible with the existing one) weren't doing something wrong.
But besides being incorrect, the idea that extensions that break on removal of the GIL are "bad" is irrelevant. The question is what the ecosystem will bear. The people who own the codebases I'm talking about are not pushing for GIL removal. They have large, working codebases they're using in production and the cost of redesigning large chunks of them would be real. GILectomy is being pushed by people with varied motives but they certainly don't speak for everyone and their case isn't going to be made successfully by casting aspersions on those with different priorities.
Relying on a published guarantee of CPython is not technical debt!
What will happen is that extensions that don't work with the new behavior will be assimilated by the proponents. Support for the new behavior will be hacked in, probably with many mistakes.
The bad results will then be marketed as progress.
it is now! just like that bill you didn't know you owed until it came in the mail, technical debt builds up regardless of whether you know it's owed.
keeping a GIL when entering and exiting C extensions seems like an obvious stepping stone.
That doesn't contradict what the GP said. Those modules are largely written in C, not Python. And that is because nobody would choose Python for its performance.
> And that is because nobody would choose Python for its performance.
Python the runtime may not be the most performant by itself, but python the ecosystem only took off because of performance hacks like those popular modules, PyPy, Cython, ... .
> Those modules are largely written in C, not Python.
Yet people choose C to implement those modules because Python is not performant enough, and isn't that a nice thought? Let's do something good for humanity and drive another stake into that old abomination's heart by making it less relevant.
What? PyPy and Cython are hardly the reason for python "taking off".
The issue with python is that the ecosystem is so big, everyone thinks their niche is the one python was "built for". Data science, web backends, sysadmin, app scripting, superclusters, embedding... They all think they're the biggest dog, when the reality is that they're more or less all the same. PyPy and Cython are probably popular in some of those niches, but python is not just about them.
The reality is that python "took off" because of two things:
- the syntax
- the ease of interfacing with other worlds, typically (but not limited to) C/C++
Removing the GIL seriously threatens the second reason, because integrations that used to be fairly easy will have to be fundamentally rearchitected. Moving from Python 2 to 3 was trivial in comparison, and it took a decade; this transition might take much longer, or never really happen, with massive losses for the ecosystem. And all for what, some purist crusade that interests only a fraction of the whole community...?
Come now, how many people compile python to use it?
How many people will be surprised / upset that they have to do that once `--disable-gil` is mainlined into the default builds on python.org?
How long before 'build it yourself and use --disable-gil' becomes 'we've enabled --disable-gil by default for Great Victory for Most Users'?
I think it's very very naive to think that the flag will remain an obscure 'use if you want to build python yourself' if the PEP is successful.
> The global interpreter lock will remain the default for CPython builds and python.org downloads.
Is a joke. It means, for now. For the work in this PEP.
I mean, you've literally got the folk from numpy saying:
> Coordinating on APIs and design decisions to control parallelism is still a major amount of work, and one of the harder challenges across the PyData ecosystem. It would have looked a lot different (better, easier) without a GIL.
That isn't 'for a few people who are vaguely interested in obscure technical stuff'; this is the entire python data ecosystem, saying "this will be the default in the future because the GIL is a massive pain in the ass".
> If it doesn't work for you, use the GIL enabled.
Sure, it can be opt-out; but you're fooling yourself if you think this is going to be 'opt-in'.
It will be the default once it's completed, sooner or later, imo.
...and when the 'default' build of python breaks c-extensions, that is a breaking change, even if technically you can build your own copy of python with different compile flags.
Long term? I doubt it. Who's going to build those packages?
Two major parallel incompatible implementations of Python, maintained at the same time, duplicating all efforts at testing, dev, build, release train, etc. while both are maintained.
Does it sound familiar?
Even ignoring the obvious parallels to Python 3000, there’s a limited amount of time people will be bothered maintaining both.
Probably in the long term the two build modes will be merged and there will only be a "nogil" build mode.
Which doesn't mean there will be no GIL. Contrary to popular belief, the goal of PEP 703 is not to remove the GIL, but only to be able to disable it at runtime if you need free threading, so it is an opt-in feature.
In the interim, it seems that Conda have volunteered to build the extensions that are frequently used in the scientific community.
Could you elaborate on how the Python ecosystem will suffer if these financial institutions get left behind? What sorts of contributions come from these users in particular?
Have you seen PIP, conda, asyncio from 3.4 to 3.8, the standard library documentation, the great 2 to 3 migration, type "annotations" and mypy, the variable scoping "rules" or the multiprocessing module?
We python users are more than ready for whatever pain we have to deal with in the future, we are used to misery we live it every day.
True Pythonistas love misery, almost as much as we love kvetching. Without the misery, we'd have nothing to talk about. Don't get me started on typing.
> Most C extensions and embeddings will require major structural changes or even rewrites.
Sam claims otherwise: "Most C API extensions don’t require any changes, and for those that do require changes, the changes are small. For example, for “nogil” Python I’m providing binary wheels for ~35 extensions that are slow or difficult to build from source and only about seven projects required code changes (PyTorch, pybind11, Cython, numpy, scikit-learn, viztracer, pyo3). For four of those projects, the code changes have already been contributed and adopted upstream. For comparison, many of those same projects also frequently require changes to support minor releases of CPython."
I'd also like to point out that the GIL does not inherently make C extensions thread safe. It only protects the interpreter itself. For example, calling in to the C API can already release the GIL as I explained in reply to /u/bjourne here:
The standard build that you get at python.org will work exactly as now (so with a GIL).
The --disable-gil build (which will most likely be available through conda or similar) will have the option to run either with the GIL or without. It will automatically detect extensions that are not compatible and switch to GIL mode, but you can override this with an environment variable.
> This vastly understates the work involved. Most C extensions and embeddings will require major structural changes or even rewrites.
Why?
Once you have a threaded Python, you can allocate one thread to run with a GIL and the other threads without it. Old extensions can access one thread with the old GIL API and new extensions can access all the things with the GIL-less API.
People who want the performance will rewrite their extension. People who don't, won't.
I don't think it is possible to activate the GIL for only one thread; it has to be activated process-wide.
But if you want free threading and you have to use an old extension that is not GIL-less, then you still can, by guarding all access to this extension with a lock and setting the PYTHONGIL variable to 0.
But you will need to compile the extension against the --disable-gil build of Python, or obtain such a build from somewhere if the extension author doesn't provide it, as the ABI of "standard" Python and "--disable-gil" Python is different.
How --disable-gil builds of Python and of extensions will be distributed is not yet totally clear. I don't think PyPI already has a clear idea of how to handle those extensions, for instance.
That's not quite good enough - people use threads to e.g. permit multiple blocking i/o operations to be in flight at once, which works perfectly well right now. So you're going to want to support multiple GIL threads if you want to be able to support existing code safely.
Sadly, you can't just trivially bolt on thread safety to code that was not designed with it in mind.
However, if you go the other way and introduce e.g. an @nogil decorator, similar to that seen in some cpython alternatives, people will have a straightforward path to opt in incrementally as they fix and verify the critical parts of their code, while preserving known working behaviour elsewhere without having to throw entire complex systems over the wall at once.
would a fork be so bad? maybe I'm off but I feel like most people have no real need for GILless and would leave it off if they were even aware it existed. Those who would need it the most are the most equipped to handle the cost of adoption.
And the probability of a given piece of C code that was written assuming a GIL-protected environment working when called from multiple threads at once is pretty low for anything not 100% purely functional.
Honestly, that’s not my biggest worry. My bigger worry is that now Python variables your extension is reading can be changed by another thread in real time. Before if you held the GIL, you knew nothing would change while you poked some Python datastructures.
That's not the implicitness I'm talking about. While it sounds like loading old C modules will just re-enable the GIL, the problem is that they will never be updated to not rely on the old Python concurrency model. All that C code was written implicitly assuming that certain blocks of code were surrounded by a GIL.
It could be a real headache for any hoped-for transition to nogil Python if lots of GIL-reliant C code is floating around where there's little hope of updating it without having to worry about subtle bugs popping up. And even if the conversion was risk-free (which I doubt), many organizations will still not want to dig into their legacy C codebases and make significant changes.
I'm the author of a Python C extension, and having the GIL gone will be a lot of work. Code currently looks like this:
* C function called with GIL held
* Extract data needed to do work
* Release GIL
* Do work
* Reacquire GIL
* Modify data, build result
* Return
As an example of the changes, a list could be passed in. I would need some form of locking while processing that list so that mutations while processing won't crash the code.
The GIL does currently result in robust code by default because data can't mutate underneath you. Without the GIL the code will appear to work, but it will be trivial for an attacker to use mutations to crash the code. Expect huge numbers of CVEs.
Nothing between Release GIL and Reacquire GIL needs to change. Depending upon your extension, possibly nothing needs to change for the other steps either. Per Sam Gross:
> Most C API extensions don’t require any changes, and for those that do require changes, the changes are small. For example, for “nogil” Python I’m providing binary wheels for ~35 extensions that are slow or difficult to build from source and only about seven projects required code changes (PyTorch, pybind11, Cython, numpy, scikit-learn, viztracer, pyo3). For four of those projects, the code changes have already been contributed and adopted upstream. For comparison, many of those same projects also frequently require changes to support minor releases of CPython.
If you're using the GIL to protect access to non-Python objects, that will need to change.
The PEP mentions a future HOWTO on how to update existing extensions. I wish that were already written.
There's disagreement among Python maintainers in the discussion thread on the PEP in how much work will be involved and I don't expect to resolve it here.
As far as I know PEP703 includes provisions to make operations on containers thread-safe. That should at least avoid most crashes.
Borrowed references can be more problematic, but it seems most cases could be fixed by replacing GetItem with FetchItem calls.
Overall, as the author of another Python C extension, I don't really think it's going to be that much of a pain. In fact, I could even get away with no changes (other than build fixes and such) if, for example, I guarantee that each "main object instance" is only accessed by one thread, and I'd still get a lot of benefits from nogil.
Whether the individual operations are thread-safe is irrelevant. For the code to be thread-safe it needs to acquire lst's lock before the if statement and release it afterwards.
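In Python terms, the shape being described is roughly this (a hedged sketch, not the snippet from the parent comment):

    import threading

    lst = [1, 2, 3]
    lst_lock = threading.Lock()

    def drain_one():
        # Even if every individual list operation is atomic, the emptiness
        # check and the pop() together form one critical section.
        with lst_lock:            # acquired before the if ...
            if lst:
                return lst.pop()
            return None           # ... released after the whole check-then-act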
Your example is not thread safe because the GIL does not protect critical sections like that. The GIL only protects the Python internal interpreter state. PyList_SetItem discards a reference to the item being replaced. If the ref count of the item becomes zero, the item's destructor is run. The destructor can release the GIL.
IOW, your example is not much different than the equivalent Python code:
The "Defining Extension Types: Tutorial" also includes this warning in an example that uses Py_XDECREF:
But this would be risky. Our type doesn’t restrict the type of the first member, so it could be any kind of object. It could have a destructor that causes code to be executed that tries to access the first member; or that destructor could release the Global interpreter Lock and let arbitrary code run in other threads that accesses and modifies our object.
Yes, you are absolutely right, but that doesn't change the fact that many extensions are written as if innocuous function calls like PyList_SetItem won't context switch. It mostly works fine since custom destructors releasing the GIL are extremely rare.
I dunno, I'd rather have a reliable bug than a heisenbug.
Any class with a __del__() method is going to release the GIL. That's not uncommon. My intuition is that threaded Python is rarer than classes with a __del__() method.
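A tiny pure-Python illustration of that point (hypothetical class):

    class Finalizer:
        def __del__(self):
            # Arbitrary Python runs here, in the middle of whatever operation
            # dropped the last reference to the object.
            print("destructor ran during the item assignment")

    lst = [Finalizer()]
    lst[0] = None   # replacing the item drops the last reference, __del__ runs now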
I don't know about the complexity of your extension, but the PEP provides per-container locks ("This PEP proposes using per-object locks to provide many of the same protections that the GIL provides. For example, every list, dictionary, and set will have an associated lightweight lock. All operations that modify the object must hold the object’s lock").
This is automatic via the PyList_GetItem/SetItem API, so I guess the error you're talking about is that you read a list y = [A, ...], your code reads A and copies the data to (new) A2, and then you iterate over it again and see that A != A2 because another thread has modified y?
> This is nothing like the Python 2 > 3 transition.
Well, what py2to3 promised was also not what happened, and it was by far one of the worst disasters of a migration in the history of open source, so you can't blame people for being skeptical.
Heh, is the Perl 5 to 6 migration already forgotten? I think this says something about how well it went, because at the time Perl was more popular than Python.
There was no Perl 5 to Perl 6 migration. Perl 6 was announced, a bunch of design work happened on it, and then it became a different language run by different people rather than a version of Perl. People are still writing Perl code extensively, and Perl 5 is still maintained.
I would expect Python to follow a similar language split. People were disappointed at how few breaking changes occurred in 2->3. If there were going to be a 3->4 migration, there would be a large number of proposals to correct other deficiencies in the language.
... I mean, if you want to compare: it had exactly the same problem, so it's just more evidence that this isn't something specific to Python; it's simply a terrible way to go about it.
The only difference is that they noticed it was a bad idea midway through and went back to developing Perl 5.
And I can still put "use v5.8" in a Perl script and write a script that works on a CentOS 5 instance from 2007, if I need to do something on a legacy system, and it will work the same on the latest Perl.
> because at the time Perl was more popular than Python.
I don't believe that was true at the time, as P6 took like a decade to decide on semantics, and only then did the implementations start to happen. More existing code, sure, but much of web dev moved to PHP/Ruby and many other things to Python.
And unlike Py3 it was initially also much slower than P5, making migration kinda worthless.
Maybe I am off base here, but big migration efforts feel like they would be significantly easier in a compiled language. Potentially not a fair comparison when the compiler and tooling can automatically check so much code for errors.
> Posting a Wikipedia link without commentary or context is essentially a text meme. It doesn’t invite discussion.
You literally just replied, so we are engaging in discussion; empirically, the wiki link above has done literally the opposite of what you are saying! At any rate, the intended relevance of the wiki link regarding loss aversion is the following:
When something bad happens to someone, they tend to weight it as worse than an objectively equivalent good experience; it is thus important to be aware of such a bias, in addition to using regular old critical thinking.
In this case, critical thinking could look like the following: “although this is a ‘migration’, what do we mean when we say that? and is it really the same thing as py2 to py3?”
I would tentatively propose “no” to the latter question; and would additionally propose that calling this a “migration” is not terribly useful as although it’s not incorrect, it’s insufficiently specific and seems to invite a category error.
In addition to the loss aversion bias mentioned above.
> Destructors and weak reference callbacks for code objects and top-level function objects are delayed until the next cyclic garbage collection due to the use of deferred reference counting.
this actually does "break" a lot of things, as you would be surprised how much code implicitly relies upon CPython's behavior of calling weakref callbacks immediately, as soon as an object's last reference goes away. This is why keeping test suites running on PyPy can be difficult, because PyPy defers that work until a GC pass.
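A minimal sketch of the timing difference (the --disable-gil behavior described in the comments is taken from the PEP bullet quoted above, not something I've measured):

    import gc
    import weakref

    def f():          # a top-level function object
        pass

    log = []
    r = weakref.ref(f, lambda ref: log.append("callback fired"))

    del f             # on today's GIL build the callback fires right here
    print(log)        # ['callback fired']

    # On a --disable-gil build, top-level functions use deferred reference
    # counting, so per the PEP the callback may not run until the next
    # cyclic garbage collection:
    gc.collect()
    print(log)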
> Removing the GIL will require a new ABI, so existing C-API extensions will minimally need to be rebuilt and may also require other changes. Updating C-API extensions is where the majority of work will be if the PEP is accepted.
as you note, there will be *two* versions of the Python interpreter.
that means every C extension has to be built *twice*, against both versions of the interpreter. Go look at how many files one must have available when publishing binary wheels: https://pypi.org/project/SQLAlchemy/#files . The number of files for py3.13 now doubles. It's not clear if we actually have to have both Python builds present, so would I have /opt/python3.13.0.gil and /opt/python3.13.0.nogil? If the GIL removal changes almost nothing, why have two versions of Python?
I don't see why you couldn't use updated "No GIL" extensions with the GIL interpreter. The "No GIL" interpreter mode will simply abort if you load an unsupported C extension.
It's worth highlighting that the GIL will still be available even when compiled with --disable-gil.
> The --disable-gil builds of CPython will still support optionally running with the GIL enabled at runtime (see PYTHONGIL Environment Variable and Py_mod_gil Slot).
You've conveniently described Python behavior, but lots of C code relies on the GIL implicitly and will need to add locking to be correct in a nogil world.
I'm not saying for sure this is bad! I do think it's being dishonest about the potential impact, though. Lots of critical libraries are written in C.
This is why the proposal is to, by default, reenable the GIL at runtime (and print a warning to stderr) whenever a C extension is loaded, unless that extension explicitly advertises that it does not rely on the GIL.
Dishonest? I didn't hide that fact: "Removing the GIL will require a new ABI, so existing C-API extensions will minimally need to be rebuilt and may also require other changes. Updating C-API extensions is where the majority of work will be if the PEP is accepted."
Yeah but everyone knows that is obviously the biggest issue. Nobody was really concerned about the Python code which is the part that you said will go swimmingly—that's table stakes for a change at all. We already assumed the pure Python code would upgrade easily; a change would have no chance whatsoever of being accepted otherwise. Everyone is worried exclusively about the C extensions, and always has been. This seems to have been presented as a new approach to GIL removal that fixes the problem with C extension breakage but it's just the exact same old approach we've always been considering that breaks the C extensions. No-GIL ain't done until all the C extensions run, especially NumPy.
Most of the other comments here at the time were similarly from people who clearly hadn't read the PEP saying it would break all existing Python code.
So I did my best to represent the backwards compatibility section of the PEP. I told folks to go read it and linked to it. I cited the portion relevant to Python code. There were too many bullet points for the C-API, so I summarized it with the disclaimer "Updating C-API extensions is where the majority of work will be if the PEP is accepted."
I also read the PEP discussion thread, where there was disagreement among Python maintainers over how much work would be required of C extension authors, but most of the folks stating it would be a lot of work didn't seem to have actually tried to port anything to the new API. Meanwhile Sam had asserted that:
> Most C API extensions don’t require any changes, and for those that do require changes, the changes are small. For example, for “nogil” Python I’m providing binary wheels for ~35 extensions that are slow or difficult to build from source and only about seven projects required code changes (PyTorch, pybind11, Cython, numpy, scikit-learn, viztracer, pyo3). For four of those projects, the code changes have already been contributed and adopted upstream. For comparison, many of those same projects also frequently require changes to support minor releases of CPython.
I don't think I've misrepresented anything.
> No-GIL ain't done until all the C extensions run, especially NumPy.
I think the word "minimally" makes it sound like the changes to existing libraries are "to an extremely small extent; negligibly." "At a minimum" would fit better, because the minimum of a set of things can still be very large whereas minimally implies the quantity is very small.
But it was downplayed in the comment when it was always the #1 reason for keeping the GIL. Python level changes were never a serious part of the argument.
What do you propose as an alternative? I write code in a lot of languages and I can't think of a single one where I don't have to consider the version. This applies to C, node, ruby, swift, gradle/groovy and java at least. Even bash. When developing for Android and iOS, I have to consider API versions.
Almost every third party ML model I look at seems to have different versions, different dependencies, and requires deliberate trial and error when creating container images. It's a mess.
Having interpreters and packages strewn across the machine is a nightmare. The lack of standard tooling has created a lawlessly dangerous wild west. There are no maps, no guardrails, and you have to beware of the hidden snakes. It goes against the zen of python.
As a counter example, Rust packs everything in hermetically from the start. Python4 [1] could use this as inspiration. Cargo is what package and version management should be, and other languages should adopt its lessons.
[1] Let's make a clean break from Python3 even if we don't need a new version right now.
The ML community has horrendous engineering practices. Everyone knows this. This isn’t the fault of Python, nor should Python cater to people who build shoddy scaffolding around their black boxes.
I mean, you're not entirely wrong but Python really really doesn't make it easy.
Consider R, which is filled with the same kind of people. There's one package repository and if your package doesn't build cleanly under the latest version of R, it's removed from the repo.
Don't get me wrong, this has other problems but at least it means that all packages will work with a single language version.
> I mean, you're not entirely wrong but Python really really doesn't make it easy.
That's a vast exaggeration. It is not "really really" hard to spin up a venv and specify your requirements. People just don't do it, and blame the tools for what are bad engineering practices agnostic to any language.
"Really really" not easy would be handling C, C++, etc. dependencies.
Generally that is a straightforward process of compiling, reading the error message, googling "$dist install $dirname of missing dep", running the apt-get / emerge / yum command, and then repeating the compile command. Sometimes people will depend on a rare and unbundled dep, but not that often. Worst case you need to upgrade the automake toolchain or rebuild Boost or something.
Maybe more time than getting python deps to work but more deterministic and takes less cleverness.
I work in data science in python (and the parent was about ML) and basically everything in that space has C and Fortran level dependencies and this is where Python is really really bad, so no it is not as simple as you're making out.
I really really wish it was, as then I wouldn't have had to learn Docker.
Python is a much older and generalist language than R, so yes, while it would be great to impose this kind of order on things, it’s not practical for its current extent of use.
That being said, after two decades of using Python professionally, the only real problems I've ever encountered are "package doesn't support this version for {reasons}" and "ML library is doing something undocumented and/or dumb that requires a specific Python version." The former is normally because the package author is no longer maintaining their package, and the latter is because, again, the ML community is among the absolute worst at creating solid tooling.
I don't disagree that Python's place in the ecosystem ("generalist" - i.e. load-bearing distro fossilization in everything from old binary linux distros, container layers, SIEM/SOAR products, serverless runtimes...) leads to much packaging complexity that R just doesn't have
However, Python (1991) is only 2 yrs older than R (1993)
Rust and Node (via nvm) feel good. The worst I run into is “this version of node isn’t installed” and then I just add it. And I don’t have to worry about where dependencies are being found. Python likes to grab them from all over my OS.
I use direnv and pyenv. When I cd to a repo/directory, the .envrc selects the correct Python and the directory has its own virtual environment into which I install any dependencies. I don't find that Python grabs packages from all over the OS.
pyenv works locally, no matter what the project opts to use. The only thing it needs for a project 'to be managed' is a .python-version file, which you can throw in .gitignore
It doesn't matter what you do. The vast majority of code I'm using from other people doesn't. Even my personal python methodology differs from yours.
Plus, you now have to teach and evangelize your method versus the dozens of others out there. It's crazy town.
The negative thoughts and feelings I once had for PHP are now directed mostly at Python. PHP fixed a lot of its problems over the last decade. Python has picked up considerable baggage in that time. It needs to take the time to do the same cleanup and standardization.
I was describing a workflow that works for me to someone who didn't seem to have found an effective Python workflow in hopes that it can work for them too. I work across a variety of languages and none that I've worked with doesn't have some issue that I can't complain about[1]. I personally don't find Python all that painful to work with (and I've been working with it since 1.5.2), but I understand my experience is not universal.
[1] If it's not the language, it's the dependency manager. If it's not the dependency manager, it's the error handing. If it's not the error handling, it's the build process. If it's not the build process, it's the community. If not the community, the tooling. Etc. I have some languages I like more and some less. Mostly it comes down to taste. I'm not here to apologize for or defend Python. I'm only here to describe how I use it effectively, and to correct what I thought were inaccuracies with respect to removing the GIL.
I use direnv because I work with many languages and repos and I don't want each language's version manager linked into my shell's profile. As well, direnv lets me control things besides the language version. Finally, direnv means I don't have to explicitly run any commands to set things up. I just cd to a directory.
FWIW, I don't think it's nice that rustup fetches and installs new versions without prompting, but I suppose that other users like it or get used to it. Fortunately most Rust projects work on any recently stable version.
> rustup fetches and installs new versions without prompting
I don't think that's true. rustup installs a new version only when you run `rustup update`. What the parent is talking about is pinning a particular rustc version in a rust-toolchain.toml file, which allows rustup to download that version of rustc to build that particular project/crate.
rustup will automatically download that version when you interact with that project, though, and that's what I mean. It doesn't sit right with me, comes as a surprise, but I guess it's not the biggest issue in the world.
Node does allow you to declare which Node version is supported in your package.json. The declaration is there, but there isn't any tool that reads it and switches versions accordingly. I feel it is somewhat half-assed. But it could also be caused by the fact that the entity that distributes the packages (npm) and the entity that distributes the Node binaries (various Linux repositories) aren't the same group of people. So there isn't really anyone who can do anything about it unless we get something like corepack someday (probably someone should name it 'corenode'?).
Isn't this all handled by pip typically? Even though most models don't necessarily put it in the readme, the user should be using some sort of env manager.
I mean, Java seems like a pretty good alternative? Obviously it's trivially true that programmers have to care about versions, but they've done miracles in the VM without breaking compatibility.
Pragmas seem like the correct way to have done the Python 2->3 migration. Does anyone know of some technical limitation as to why they weren't used? It is very obvious solution in hindsight, but I wasn't there.
I saw some people mentioning changes like turning the print statement into the print function. That was actually one of the most trivial changes, and you could import print_function from __future__, which worked like a pragma.
A similar problem existed with the changed division behavior (which was actually more challenging), but likewise you could enable the new behavior per file.
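For reference, those per-file opt-ins looked like this on Python 2 (they are no-ops on Python 3):

    from __future__ import print_function, division

    print("print is a function in this file")
    print(1 / 2)   # 0.5 ("true division"); plain Python 2 would print 0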
The main problem with the migration, though, was the addition of Unicode. You can't just enable it on a file-by-file basis, because once you enable the new behavior in a single file, you will start passing Unicode arguments to code in other files, and if that code wasn't adapted, it will break.
And it was even worse than that, because the problem extended to your dependencies as well. Ideally dependencies should be updated first, then your application, but since Python 2 was still supported (for a decade after Python 3 was released) there was no motivation to do it.
And if that wasn't enough, Python 2 already had Unicode support added, but that implementation was incorrect, so even if you imported unicode_literals from __future__ you potentially broke compatibility with existing Python 2 dependencies, without any guarantee that your code would work on Python 3.
IMO that particular change couldn't be done with pragmas; the core issue is that Python 3 put a clear separation between text and binary data, while Python 2 mangled them together. That was still true even when you used Unicode in Python 2.
The proper way to perform the migration IMO would be to type annotate the code. And then run mypy check in python 3 mode.
Back when Python 3 was initially conceived, the language just wasn't that widely used, and mostly by enthusiasts. Some breakage wasn't considered a big deal; it was expected users would easily update their code.
But during the time it took to design and deliver Python 3, the language exploded in popularity and reached a much wider audience, and third-party libraries like numpy became crucial. So when Python 3 was ready it was a completely different ecosystem which was much harder to migrate. But I don't think the core team really realized that before it was too late.
Asdf makes all of this pretty easy. For consulting I often need multiple versions of everything to match client projects:
Just install them with asdf and put a .tool-versions file in the project folder with the desired tooling builds.
You would not need separate packages to do that (in fact, you can't do this with separate packages because dpkg will complain if two packages provide the same file).
The PEP actually states that the nogil version would also have an environment variable allowing you to temporarily enable the GIL. Although I guess in practice they might still build separate versions.
As for managing Python library dependencies, I use poetry (https://python-poetry.org), though unfortunately both it and pipenv seem to progressively break functionality over time for some reason.
pyenv is a third party tool that makes some of this easier, notably around creating more than just a virtual env in that you also choose the Python version.
Python is not hard to deal with in this regard; I think people are just uninformed.
If you’re installing a Python package into the global site packages directory (ie, into the system Python) you might need sudo. That’s how permissions work.
I don't know the -u flag on pip; I've never used it and can't find it in the docs.
With a virtual environment sudo is not needed. Assuming you created it, and/or it is owned by you.
Virtual environments are just directories on disk. They are not complex.
I don’t use conda because it’s never felt even remotely necessary to me.
how about when you are authoring a script under your own user, but then want to schedule it for cron to run periodically?
I often find myself working under my user on a remote server, but then I want to schedule a cron job and run into all sorts of permission issues, bugs, and missing packages.
especially when multiple machines need to run the script, and I don't want to involve containers to run a simple 20-line python script.
this is why Golang is so popular - you can just scp a single binary across machines and it will just work.
Are you kidding me? The horrendous way Python does dependency management and virtual environments, and the fractured ecosystem around those, is one of its biggest pain points, often covered by core CPython developers and prominent third-party Python developers, hardly "misinformed" people.
That comic is very old. In the days of 2.x it was a little hairier, but nothing like people make it out to be.
The literal only thing you need to understand is “sys.path”. If you inspect this in a shell you will know what you’re up against. Python is all literally just directories. It’s so easy and yet people get so bent out of shape over it.
Create a venv, activate it, and use pip as normal. If you ever run into issues, look at sys.path. That’s it.
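To make that concrete, this is roughly all there is to the model (the shell commands in the comments are the usual venv workflow; exact paths vary by OS):

    # python3 -m venv .venv
    # source .venv/bin/activate        (.venv\Scripts\activate on Windows)
    # python -m pip install requests   (lands in the venv's site-packages)
    import sys

    # Everything importable comes from these directories; with a venv active,
    # its site-packages shows up here and the system-wide one normally doesn't.
    for entry in sys.path:
        print(entry)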
Which is irrelevant. We're talking about the dependencies/packaging/virtual environments situation, not whether "it's easy to be a Python developer" in general.
And you can disagree all you want, but it's simply wrong that Python's packaging/venv ecosystem is "just fine".
There are options to do things other ways, but most of the time I just use venvs and pip for everything.
Is it because people have to use venvs that people complain about it?
I’ll admit being able to install via an OS package manager, vs pip, vs anaconda etc etc can be confusing, but is any of that really Python (the language)’s fault?
I think the primary problem with removing the GIL is the performance hit from degraded garbage collection. Quoting the issue with the implication that it isn't a big deal because it's not many words doesn't make the case for me. The delay in when garbage collection happens is, to my mind, the main reason why Java sucks so much as a memory hog vs C-based languages. Process performance is in large part determined by how soon critical, lowest-latency memory can be freed for the top-level working set of objects.
It's an interesting debate (money aside). In the past 15 years, at least to my eyes, one of the most important changes in software engineering culture has been the now-systematic search for correctness. Systematic, organized, large-scale testing is one side of the coin, but the real value, and the significantly more important change to me, is guarantees/proofs: static analysis, stronger type systems, etc. So type annotations in Python, TypeScript, Rust instead of C++, etc. Tools as well: the golang race detector, Infer, Coverity, PVS-Studio, CodePeer (Ada)/SPARK, CompCert, etc.
In this particular case, I understand that removing the GIL would create potential new risks (but provide better performance), and although tooling is mentioned, it's not officially part of the plan, so it feels like a regression of sorts.
It might be worth it though, because somehow Python has ended up being the lingua franca in many applications where performance matters a great deal (ML, scientific applications, etc.). I don't think the performance benefits for something deployed at such a large scale should be understated.
Although you are almost certainly correct, I'm curious if there is a case where faster doesn't mean more power efficient, i.e. more concurrency but more power used overall.
Concurrent execution is ~always less power efficient than serial execution. I have no idea what the parent comments are talking about. For almost any work X with time to run on a single core T it will run on two cores in time larger than T/2, and on three cores in time larger than T/3, ... This is due to synchronization overhead (e.g. networking, locks, delays) and also often due to shared resources getting saturated (e.g. VRAM on a GPU).
That is unless in the serial execution you still have to pay for some resources that are unutilized.
However, if you're running a computer with a 16-core CPU, power usage doesn't scale linearly with cores. There's a lot of overhead, especially if you're talking about a laptop/desktop with display, HID, etc.
I don't think that has really been a change in the past 15 years. Well, maybe there was just a lull in people caring about correctness and robustness... But it's not like before 15 years ago everyone was using dynamically typed languages and static analysers didn't exist.
I guess you could say the world went through a dynamically typed, loosey-goosey phase and then realised that wasn't so great.
I think one thing that actually has changed is that SMT solvers have massively advanced in the last 20 years to the point that formal verification is practical. Ish anyway. It still seems to require a PhD to do software formal verification. Hardware formal verification is easy though.
PEP 703 mentions the multiprocessing module which I used recently with great success. It is essentially message passing parallelism, i.e. the actor model. I would think even with the GIL removed message passing is still the best approach for a higher-level language, like Python, to handle parallelism as it is conceptually simple and lock free.
The primary disadvantage of the multiprocessing module is it uses separate processes which means my resource heavy Python extension needs to be duplicated multiple times in memory (at least on Windows). I think the move to supporting threads means the module could leverage them instead of separate processes, avoiding the need to duplicate C extensions in memory, and thereby reducing the overall memory pressure on a system.
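A minimal sketch of that message-passing style using the stdlib multiprocessing primitives (toy workload, made-up numbers):

    from multiprocessing import Process, Queue

    def worker(inbox: Queue, outbox: Queue) -> None:
        # Workers only see data arriving on the queue, so no user-level locks.
        for item in iter(inbox.get, None):     # None is the shutdown sentinel
            outbox.put(item * item)

    if __name__ == "__main__":
        inbox, outbox = Queue(), Queue()
        procs = [Process(target=worker, args=(inbox, outbox)) for _ in range(4)]
        for p in procs:
            p.start()
        for n in range(100):
            inbox.put(n)
        for _ in procs:
            inbox.put(None)                    # one sentinel per worker
        results = [outbox.get() for _ in range(100)]
        for p in procs:
            p.join()
        print(sum(results))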
I agree that message passing can be a really good choice. The problem is that the overhead of passing large messages can be really substantial. In general, you'll need to hold each message in memory up to 4x (at least temporarily): the Python representation and the serialized "on the wire" representation, once each on each end. There's also the issue of `pickle` generally being a horrible serialization format and being the default.
There are ways around this, Python has a rather rudimentary and archaic shared memory subsystem[0] that can make this much more efficient.
On the other hand, the Python multiproccessing sync primitives actually work over a network with almost no code changes, so you can horizontally scale with little effort... but then you're paying the serialization and network overhead, so ymmv.
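For reference, a stdlib-only sketch of that shared-memory route (shown in one process for brevity; in practice the "attach" half would run in a worker that receives only the block's name):

    from multiprocessing import shared_memory

    shm = shared_memory.SharedMemory(create=True, size=1024)
    shm.buf[:5] = b"hello"              # write directly into the shared block

    # In another process: attach by name instead of pickling the payload.
    peer = shared_memory.SharedMemory(name=shm.name)
    print(bytes(peer.buf[:5]))          # b'hello'
    peer.close()

    shm.close()
    shm.unlink()                        # free the block once everyone is done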
I have no experience with Windows, but at least on Linux, if the forking is handled intelligently (namely, late enough that identical assets don't need to be re-written), duplicated memory should be handled by Copy on Write.
Have you tried it? You are right in general, but even if you don't write anything in Python, the CPython runtime constantly writes to the heap, so everything gets copied. You need specific support from the CPython runtime to exploit copy-on-write.
I guess when GP said "heavy ... C extensions" I was thinking code pages from libraries and structures managed by C, yes python objects will require additional consideration.
CPython still needs to update reference counts after the fork even if your python code itself doesn't modify any data, making Copy on Write less helpful than it would first appear. Monolithic memory buffers like numpy arrays can be shared efficiently via CoW, but rich data structures with many PyObjects suffer from the refcounting issue.
Multiprocessing in Python isn't exactly like forking on other languages. There's no shared Python memory, it's just a new interpreter instance, and then you have to use sharing mechanisms (shared arrays/values, pipes, queues) to transfer data explicitly.
I know, but the entire point of CoW is you don't need shared memory, as long as the fork happens after the memory was written it'll just act like an entirely independent copy transparently to the application, with the OS duplicating the backing page on the first write.
Edit: Ah I see, you're saying the library itself discards anything created from python. That sounds like an implementation problem then, I'd expect it to at least keep imports so any static code pages don't need to be recreated (the interpreter itself should be fine at least, unless it's doing something really dumb).
Multiprocessing is great. Very easy to use and with the Manager(), passing data is easy enough, though very "slow" relatively speaking. It has to serialize the data, send it to the Manager process, then send to all the processes (I believe), and then deserialize it. Perhaps this nogil update can speed that up?
My experience with the multiprocessing module is exactly opposite. It's been a few years since I last used it seriously (I still use Python a lot) mainly because of how bad it was. The main issue is precisely communication between processes: Python uses the pickle module to serialize/deserialize data, which not only adds quite a bit of performance cost, but it's very hard to debug when for some reason some object is not supported by pickle (and it's a common occurrence).
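A minimal reproduction of that class of failure (not from the parent's codebase):

    import pickle

    # Anything unpicklable can't cross the process boundary, and the error
    # tends to surface far from where the object was created.
    callback = lambda x: x + 1
    pickle.dumps(callback)   # raises pickle.PicklingError: Can't pickle <lambda>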
> If PEP 703 is accepted, Meta can commit to support […] between the acceptance of PEP 703 and the end of 2025
Meanwhile, the Steering Council is dragging their feet, even though so many people in the community have chimed in their support for nogil being a thing (in the linked thread and two others).
This is a decision with huge ramifications and the PEP was submitted only 5 months ago. Caution is welcome here, especially with PEP 684 around the corner.
> The PEP was posted five months ago, and it has been 20 months since an end-to-end working implementation (that works with a large number of extensions) was discussed on python-dev.
I would expect the SC to be active participants in the community, including the python-dev mailing list, so they should have known what this change means by that point.
There's a great talk by David Beazley (I don't remember which one) in which he basically explains that removing the GIL isn't difficult at all. After all, it's just one lock.
The issue is all the libraries and packages that were built around it. It's now been YEARS of building on top of the GIL.
So anyways, PEP 703 is a nice effort, but I doubt we (everyday mortals) can enjoy it. Meta, and big companies with a specialized team, might be able to exploit it by making sure their entire stack is GIL-free.
> Meta, and big companies with a specialized team might be able to exploit it by making sure their entire stack is GIL-free.
Not really. Most of the popular libraries will have nogil versions ready for sure (numpy, pandas, pytorch, etc). Server applications will probably benefit from being able to serve multiple requests in a real multithreaded environment without many changes (in fact, I bet a lot of HTTP libraries will see simplifications due to not needing to deal with processes for real parallelism). Even packages without changes will probably have at least a small degree of thread safety (for example, two instances of the same object should be independent and could be used in different threads) that allows users to still leverage parallelism.
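One concrete shape that per-instance independence can take, as a hedged sketch with a made-up placeholder class:

    import threading

    class Session:                  # stand-in for a non-thread-safe object
        def __init__(self):
            self.count = 0

        def do_work(self):
            self.count += 1
            return self.count

    _local = threading.local()

    def get_session() -> Session:
        # Each thread lazily creates and reuses its own Session, so the
        # non-thread-safe object is never shared across threads.
        if not hasattr(_local, "session"):
            _local.session = Session()
        return _local.session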
> So anyways, PEP 703 is a a nice effort, but I doubt we (everyday mortals) can enjoy it.
I'd suggest we can be more optimistic. There is a lot of Python that regular users write daily where we delegate orchestration to established libraries (asyncio, web frameworks, PyTorch). The GIL limits how much they can parallelize your code, and its removal will help with that.
2016. A lot has changed since then, and AFAIK numpy and most of the other popular libs with C bindings have experimental branches that are already nogil-compatible.
Over time I've changed my mind regarding the GIL. Now I think its existence actually is a good thing, since it makes a lot of things easy and robust by default.
The subinterpreters route sounds really promising since it strikes a good balance between allowing those that need same-process parallelization to be able to do so and keeping the status quo (like workers in Javascript).
I'd actually veto PEP 703 at this point, since the Python optimization team is making good progress, and this could actually set them back a bit while diverting too much energy into a low-reward avenue.
Might be worth changing the title to "no-GIL". My first thought was "What is nogil CPython?" - thinking it was some specialized fork of CPython. I guess you could say that is technically true. I was previously aware of the GIL in Python, yet it still didn't occur to me immediately.
Whatever happens, I'm hoping we won't see another 2to3-like migration. I moved from Python a long time ago due to its multi-threaded limitations (among other things) to Java and JVM, and I really appreciate the stability the JVM team tries to maintain. Things are not perfect, but I'd take stability/compatibility over perfection any day.
At least take a look at PEP-703[1]. It will answer most of your questions. Significantly, even if this proposal is accepted, the GIL would remain enabled by default.
Could not agree more. The purposeful, unnecessary breaking of stuff was maddening, and then being lectured at like you were an idiot even more so. No more u"" for you!
There is zero chance of that happening. Guido van Rossum (and other key figures in the Python community) have spoken very candidly for years about the mistakes they made in the 2-to-3 migration, and their decisions in the years since have demonstrated their commitment to not repeating those mistakes.
I was curious about these comments and did a small web search. I found a video interview from May 21, 2021 [1] and have pasted some excerpted quotations from it from Guido van Rossum below for others who are curious.
“Python 4, at this point whenever it’s mentioned in the core development team, it is very much as a joke… We’ve learned our lesson from Python 3 vs 2, and so it’s almost taboo to talk about a Python 4 in a serious sense.”
[...]
“I normally talk about that as a mistake, because Python was more successful than the core developers realised and so we should have been much more aware and supportive of transitioning from Python 2 to Python 3”
[...]
“I’m not thrilled about the idea of Python 4 and nobody in the core dev team really is – so probably there never will be a 4.0 and we’ll just keep numbering until 3.33, at least”
[...]
“We now have a strict annual release schedule, so after 3.10 will be 3.11 and after that will be 3.12, and so forth. We can go up to 3.99 before we have to add another digit. Adding another digit is not completely trivial, but still much better than going from 3 to 4."
Seeing so many comments about how hard it is to just break compatibility and upgrade is sad. Instead of just throwing our hands up and saying it’s too hard, we could adopt the model the JavaScript ecosystem has seen more of which is codemods that upgrade the code for us.
If as a community we invest in those tools and make them easier to build, the cost of upgrading goes down and the velocity of high-impact changes can increase.
TC39 has a famous "don't break the internet" mantra. JavaScript doesn't even have the leeway Python has with deprecations and feature changes: it's versionless, and deployed code automatically runs on whatever engine the browser is using.
JavaScript evolves quickly but so does Python!
(Note that your approach is exactly what they tried in the 2 to 3 transition, btw, with a special tool that didn't work too well.)
That’s not what that library was, in my opinion: it was a compatibility layer, not a rewriting tool, which is what I referenced. Having a layer in between simply prolongs the issue and creates many types of problems depending on adoption. Rewriting, on the other hand, can be applied either by the author or by the end user depending on the quality, which meaningfully allows for different results.
codemod is a syntax conversion tool, using regexes. Thread-safety isn't even semantic--it's an emergent behavior question, the same as "does this code halt?" There is no general solution.
For example, is this code thread-safe?
    int foo(int* x) {
        int z = *x;
        for (int* y = x + 1; y != 0 && *y < *(y - 1); ++y) {
            z = *y;
        }
        return z;
    }
You can't tell from static analysis of the function. It depends upon what guarantees are imposed upon the passed-in "x" value. For example, if "foo" is only referenced as a function pointer passed to "baz" (also in the library), and "baz" creates "x" and uses it in a thread-safe manner, then there's no problem. But there's no guaranteed mechanical way to determine if "baz" is indeed doing the right thing, or what changes should be made to make it so.
A fully general transformation from naive to thread-safe code seems like it would make you one of the giants of computer science alongside Knuth and Dijkstra.
I think this is where concepts from Rust and Go could come in handy. Things like Go’s race condition detection and Rust’s compiler validation approaches can be used to statically analyze code. Sure, it’s a meaningful change from how many Python devs approach the language and a challenging problem, but not insurmountable given the existing work in the field.
I'm not sure what "codemods" means. Just static analysis code changer? Python is so highly runtime dynamic I'm not sure a tool is even possible to upgrade behavior, preserving correctness and intent (bugs and all).
But I think they're referencing the litany of transpilers and repackagers which exist for js. So you can add new features and then still have it run on really old systems like internet explorer 9 if you need to.
This has problems, obviously, but in my opinion it would be preferable for Python.
My reasoning is that if you need your code to work on an older system, being able to write and use current syntax is preferable to not. The hard bifurcation that Python did with 2 to 3, and now potentially with 3 to nogil, seems to me to just break apart the ecosystem more.
That differs but is a reasonable understanding. I’m instead referring to automations that perform large scale refactoring as handled by Facebook, who would be contributing to this effort.
It sounds like what you are describing is what’s known as polyfills, which convert code into a variant that maximizes functionality across implementations, which isn’t really applicable here.
Why waste effort on speeding up the old CPython interpreter rather than working on PyPy, an actual compiler? [1] "On average, PyPy is 4.8 times faster than CPython 3.7"
Some c modules are incompatible with PyPy. Two common ones are psycopg2 (PostgreSQL driver) and pyodbc (ODBC driver). There are alternatives that work but they don't have quite as many features (psycopg2cffi and pypyodbc). Ansible also runs slower in PyPy than CPython.
I tested this with a playbook that installs a set of scripts and dependencies on a server. In all there's ~30 tasks. It takes ~1 minute using CPython 3.10 and ~1.5 minutes using PyPy 3.10 (v7.3.12). Ansible says libyaml is installed in both runtimes so that's ruled out. I wonder if it's the difference in interpreter start-up time, and it accumulates over time as Ansible starts so many Python processes as it runs its tasks.
It's hit and miss. Sometimes I get almost no improvement and sometimes I get 10x. But the slow warmup and up to 3x memory use (in my experience) really make me prefer CPython unless I have a very specific problem.
The FAANGs etc. have failed me for anything open source sponsorship related.
Their business depends on it, they made a fortune out of it, yet they donate little or nothing to it, or worse, they forked it and were reluctant to upstream anything.
Pick Amazon at random: it went from exploiting to ripping off, in my opinion, thanks to its boss, who sails on a luxury yacht and asks all his employees to be frugal. They all look similar to me as far as supporting OSS goes.
Between pytorch (Meta) and TensorFlow (Alphabet), I'd be surprised if there's much if any ML work today that doesn't rely on OSS projects from FAANG companies.
Chromium underpins nearly every major browser that isn't Safari or Firefox. And even more narrowly, V8 powers Node (and Deno).
More Google contributions: Angular, Kubernetes, gRPC, Golang.
Microsoft: C#, TypeScript, Language Server Protocol, VSCode.
Meta: React, zstd, Apache Cassandra, LLAMA.
This isn't even getting into open source patches (Google has consistently been one of the top 10 Linux kernel contributors for the last decade), systems papers that have inspired research and other systems (BigTable, Dynamo, MapReduce, Colossus, Spanner), standards work (HTTP/2 and HTTP/3 were both adapted from Google technology), security vulnerability work (Project Zero).
(The above list is definitely Google-skewed because that's what I know.)
Even stuff like UNIX, C, C++, to extend your example.
People apparently forget the amount of money that was and is injected into them.
Someone has to put the money on the table to attend ISO meetings, buy the standards, implement the features into existing C and C++ compilers, for example.
I'm talking about fundamental OSS projects that are not for-profit at all.
Many of Google's OSS projects are there to help its profits; if not, Google can stop or kill them at will. That's fine, and it's a million times better than closed code, but it's still not the same as projects like Python etc.
Yeah, I feel like there are tons of reasons to criticize these big companies, but on open source they're pretty good for the most part. Not sure what else you could expect from them.
There are also GraphQL, RocksDB and Folly. They open sourced their JS engine for React Native too.
Maybe a good criticism is that we depend too much on these companies' free stuff, lol.
A lot of these (but not all) strike me as flashy things for which you get promoted. Many people don't need any of these.
What FAANG does not do is support critical yet underappreciated Unix infrastructure. By this I mean directly donating to developers without trying to take over the projects.
The quality of FAANG code, especially in the fault-tolerant machine learning space, is questionable as well. I'm not sure why so many commenters (in other comments!) attribute near magical qualities to any output of FAANG.
Chrome, though, is a kind of mixed bag. It already has too many features; most likely the reason it underpins so many other browsers is that nobody can keep up with the rate at which Google shovels shit into there.
They should slow down. It would be nice if they could just spend a couple years working purely on stability/performance/security, but alternatively they could just stop, it would be better for the ecosystem.
In terms of companies that rip off open source and do not give back, Meta is certainly one of the more active "give back" companies. React/Native, Docusaurus, Buck, PyTorch, RocksDB...
"Giving back" might be an understatement for Meta's open source contributions. They are the bedrock of so much OSS in the last decade, it's everyone _else_ that is "giving back."
Yes, some big companies do open source projects, and Meta is a good role model. But many of those OSS projects work for the companies themselves first and the community second, though they are indeed still a million times better than closed and proprietary code.
I'm mainly referring to long-standing fundamental projects that made the rest possible, not the projects that are created and managed by the big companies themselves.
Traditionally speaking, it's because they're the most qualified to do it. Every FAANG company takes from Open Source in order to succeed, so contributing back to a FOSS codebase is the ultimate act of corporate goodwill. You are burning real-life money and engineering hours to fix a problem for everyone and ensure that it stays protected under the same Open license that let your company succeed.
There is no legal obligation per se. Open Source is a charity, but you can understand a lot about a company's motives if you compare their FOSS usage to their FOSS contributions.
It’s not a charity if it becomes or serves as part of a standard. Sort of how semis and the various suppliers/buyers of that industry wind up agreeing on standards despite being competitors in some cases. Definitely things like the Linux kernel make sense to give back to - you’re ensuring the kernel lines up with your own corporate investments in a sense. A company unilaterally needing to fork and maintain their own Linux compatible kernel would need to sink a lot of time and money to come up with something as developed as the modern kernel. And it would become more expensive, unwieldy, and akin to technical debt over time.
It’s clearly not perfect calculus since so many defect from OSS, but there’s also clearly some business sense there.
> It’s not a charity if it becomes or serves as part of a standard.
Then it just transitions to a charity that everyone relies on. It's been embarrassingly difficult for people to admit this vis-a-vis Linux and the like, but it's obvious in a situation like programming languages. We rely on the enduring charity of its authors, license-holders and distributors to write and distribute software.
"Giving back" is not an inherently cynical process. Meta has no obligation to publish these patches in the first place, much less work as a part of the community to solve a common issue. But they do, and it's just kinda part of their culture at this point. Yes, they do still gain from it; but so do their direct competitors. It's difficult to characterize as a purely business-motivated decision, because there are much greedier avenues to pursue if that's their MO.
I think this misses the point. FAANGs don’t improve OSS code out of goodwill, they do it because it benefits them. Upstreaming those improvements has very little marginal cost, and it bolsters an environment where other companies do the same. Rising tide and all that.
Why would they? Nobody except a few home users needs RAID 5/6. There's only a small handful of talented filesystem developers (never mind the filesystem, it's equally applicable to all of them), and there's always something to do which will benefit all of us. Especially for something so feature-packed and complicated.
It is all about knowing individual engineers inside these companies. Nearly any engineer in FAANG can, with a little effort, donate their own work time to an opensource project in the form of submitting patches.
The project can help in various ways. Make sure licensing stuff is in place so that whatever IP review process the company has goes smoothly. Make sure PR's are reviewed fast. Delegate responsibility ('hi, would you like to be maintainer of this feature?'). Make sure contributions are publicly recognised (ie. Public page of maintainers/biggest contributors).
Remember that most engineers inside big companies are trying to generate good content for their performance review. "Wrote and open-sourced some code" is usually one of the checkboxes.
> Nearly any engineer in FAANG can, with a little effort, donate their own work time to an opensource project in the form of submitting patches.
I know at least one of those letters that does not want you doing that without jumping through hoops and getting prior approvals. I will let you guess which one that is.
Thats the "with a little effort" bit. But as long as you have a business case (eg. "we are adding this feature for our use, but it will lower our costs if it is upstreamed rather than maintained as a private patch"), it will usually be greenlighted.
No, Meta’s contract has explicit carve-outs for Open-Source work done on personal time that are more generous than Canonical’s (I had to choose between the two 3 months ago).
Meta has its own production fork of CPython, Cinder:
This isn't legal/enforceable in California. Your own work on your own time on your own equipment is your own property. It's messier if you're building a direct competitor or using proprietary knowledge.
React alone is one of the biggest gifts Meta brought to the open source community, and it shaped the web from monolithic pages (back then PHP and jQuery were the majority) towards modular component design. Later they even introduced the functional programming paradigm into the mix.
Next.js exists too which mashed up PHP/Ruby style with React.
This is disappointing but certainly not surprising, and to some extent we as an open source community have to blame ourselves.
I know it is not a popular opinion but the AGPL is there for a reason. If you release your code under a permissive license you should not be surprised that others benefit from it - sometimes massively. That is how it is supposed to be. You can still lament that it is unjust and I feel you, because in a sense it is. If it really bothers you, the solution is easy. Next time choose the AGPL.
I've released open-source projects under an Apache license because I want others to benefit from my work. I knowingly didn't choose GPL-flavored licenses because they limit people's ability to do this.
It's stupid sour grapes comments like this that make me hate HN commenters.
Meta and Google have both contributed an enormous amount to open source (creating PyTorch, TensorFlow, Chromium, contributing to clang/LLVM, Linux kernel, HTTP standards), but you're mad because they didn't sponsor your work and a bunch of people mindlessly upvoted it.
It's easy to blame the boss, but I don't think it's the company's responsibility to give back to open source just because they use it. If you think open source is helping you, you could contribute in your own time without expectation of getting paid for it. Most open source projects work like this. And if some company needs some extra feature, they should pay market rate for it.
Assuming the company doesn't have a clause disallowing contributions to open source. Such companies should be publicly shamed if they use any open source code.
Of the many people who ask for things, only a tiny percentage are actually willing to pay for it. Take Microsoft: I had one of their employees asking me to support their Azure stuff: https://github.com/mickael-kerjean/filestash/issues/180. When I found out the dude was actually employed by Microsoft, I somehow thought he could give a hand, but no, he was just there to make some feature request ...
I had no problem with this, by the way; just saying maybe he can contribute 0.001% of his profit to those OSS projects that he depends on. It will help him build more yachts too.
> If there’s one lesson we’ve learned from the Python 2 to 3 transition, it’s that it would have been very beneficial if Python 2 and 3 code could coexist in the same Python interpreter. We blew it that time, and it set us back by about a decade.
So it's not so easy to decide to go with again 2 different versions of Python.
I understand how cautious they are.
However, this commitment from Meta, complemented by MS financing Guido, and the quality of Sam Gross's proposal make me optimistic.
> PEP-703: Concurrent collection requires write barriers (or read barriers). The author is not aware of a way to add write barriers to CPython without substantially breaking the C-API.
Jesus. Imagine a language runtime being this hamstrung.
It's three engineer-years until 2025, and it's just a person mentioning it on the discussion board. It sounds a bit weird to throw something like this out in a public forum.
Why weird? The request was made there. The discussion on the PEP is there. And the person answering is a core Python dev and part of the Python team at meta/insta.
Their team wants it, and is offering hours to help get it done. Is this not what everyone wants from businesses using OSS?
You use smaller scale locks as and when you need them, and you spend a lot of time to carefully implement core threading constructs and similar to make sure they are correct.
Most of the time in an interpreter you’re iterating through the byte code for a function, and mutating variables. It doesn’t matter if two threads are executing that same byte code, it only matters if they are mutating the same object.
That’s not to say there aren’t some operations that need coordination across all threads. For that you implement safe points so you signal that something has to be done, wait for all threads to reach a safe point, do the work, and then release the threads again. You can often be even smarter about this and avoid stopping all threads at once but instead stagger things.
Allocation and GC you can make more thread local to avoid stopping the world too often.
Bytecode vs JIT doesn't really make a difference. But taking the question to managed languages (so Python, Erlang, JS, Ruby, .NET, JVM etc) - there isn't really a GIL problem to start with in language design, there are just design choices wrt what, if anything, to do about parallelism/concurrency. For example Erlang and recent browser Javascript skip shared memory parallelism and use message passing (JS web workers and Erlang processes). Python's approach is motivated by its simple refcount-based garbage collection; other GCs have it easier.
Ruby [1], JavaScript (when running in NodeJS) and PHP are single-threaded. Perl is single-threaded as well, but it features something like what is being proposed in PEP 554 (https://peps.python.org/pep-0554/).
I know Clojure is often described as a dynamic language and that it has some concurrency primitives, but my knowledge ends here.
[1] Just like Python, Ruby does have implementations that allow multi-threading.
Ruby is not single threaded even in its default implementation, but it does have a global VM lock (GVL) that is equivalent to the GIL. Not only does it have threads, it also has fibers which allow for more cooperative scheduling.
The GIL is a conscious trade-off: it accepts some contention and lost performance in exchange for simpler code.
Without the GIL, you'd still need to use different locking/synchronization mechanisms to ensure thread safety (if required), but they'd be more nuanced and spread over the entire codebase (complicating the backend).
I think, comparing it to something like Node.js, the way they work around the issue is that almost all their code is written in JavaScript itself. Python has a different model, where a very large amount of the code is written in C and Python itself is just the glue.
A huge amount of NodeJS is written in native code. JS is single threaded. Async work doesn’t work without native code. JS doesn’t need a GIL because it only has one thread.
10 years late. I remember being criticized for avoiding the GIL in some CPython C code in 2012 and I think I stopped contributing to CPython after that. There are (or were back then), curmudgeons trying to hang on to their "safe layer over 90s style C process" Python.
The GIL problem won't be solved by throwing engineer hours at it (not even Meta engineers). Fundamentally every single piece of python code ever written will have to stop and now worry about potential race conditions with innocent things like accessing a dictionary item or incrementing a value. It's a massive education and legacy code problem at least on the same scale as the python 2 to 3 migration.
I honestly don't think the community is ready for this change and don't expect to ever see stock CPython drop the GIL--perhaps there will be a flag to selectively operate a Python process without the GIL (and leave the onus on you to completely test and validate that your code works with potentially new and spooky behaviour).
That is a misconception. The GIL protects the internal state of Python. It doesn't make all Python multi-threaded code "safe".
PEP 703 still preserves many of the current behaviors, such as thread-safe access to and writing of dictionaries.
Correct. Python multi-threaded code is not magically thread-safe. What the GIL did do is make C functions atomic, like dictionary and list methods, since they're implemented by C functions. The reason is that the GIL is released and re-acquired while executing Python code (every 100 ops if memory serves), but isn't released by most C functions, so they are called with the global lock held, which makes them atomic. You can still have a thread switch between any two Python op codes. e.g. foo += 1 may or may not be atomic depending on how many op codes it compiles to (IIRC, in CPython, it is not atomic.)
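To make that concrete, here is a minimal sketch of my own (not from the PEP): disassembling an in-place increment shows it is several bytecode instructions, so a thread switch can land between the load and the store.

    import dis
    import threading

    counter = 0

    def increment():
        global counter
        for _ in range(100_000):
            counter += 1   # load, add, store: several opcodes, not one atomic step

    dis.dis(increment)     # shows the separate load / add / store instructions

    threads = [threading.Thread(target=increment) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Even with the GIL this can print less than 400000: the interpreter may
    # switch threads between opcodes, it just can't corrupt the dict/list internals.
    print(counter)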
One of the reasons I love using gevent (https://www.gevent.org) is that it's a way of introducing concurrency that does preserve these atomicity guarantees! Broadly speaking, the only time your code will be interrupted by something else in the same process is if some function call yields control back to the event loop - so if your block is entirely made up of code you control that doesn't do any I/O, you can be sure it will run atomically. And you get it for free without needing to rewrite synchronous code into asyncio.
This does make me wonder, though, if gevent will survive PEP 703's massive changes to internal Python assumptions. That said, gevent does work on pypy, so there's some history with it being flexible enough to port. Hopefully it won't be left behind, otherwise codebases using gevent will see this as a Python 2 -> 3 level apocalypse!
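For anyone unfamiliar with that model, a minimal sketch of the cooperative behavior being described (assuming gevent is installed; the function names are just for illustration):

    import gevent

    counter = 0

    def work():
        global counter
        for _ in range(100_000):
            counter += 1      # no I/O and no yield here, so no other greenlet can interleave
        gevent.sleep(0)       # explicit yield point back to the event loop

    jobs = [gevent.spawn(work) for _ in range(4)]
    gevent.joinall(jobs)
    print(counter)            # deterministically 400000, with no locks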
Ah - it's more that you wouldn't typically need to run multiple threads in the same process to handle concurrent requests. For instance, gunicorn used with gevent workers will typically fork processes to ensure all cores are used, but wouldn't require multiple threads per process - gevent would handle concurrency on a single OS thread within each process.
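For example, that setup is typically started with something like the following (where `myapp:app` is a placeholder for your WSGI application):

    gunicorn --workers 4 --worker-class gevent myapp:app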
As soon as you leave python, that C extension can give up the GIL and your other python threads can start running.
The GIL makes python code have cooperative threading. It does not protect from e.g. your thread's view of state mutating when you make a database call.
I also believe it is best practice, not a requirement, to avoid mutating data without holding the GIL in extension code - but I have mucked with a lot of different extension APIs, so I might be confused.
No, I wouldn’t call it cooperative threading, it’s still preemptive in that any Python thread can be switched with another at any instruction. That’s the same behavior as the operating system with the same potential for race conditions (except python instructions are higher level than machine code instructions.)
While C extensions can release the GIL, that only makes sense if they do enough work that a Python thread could get some things done in the meanwhile, and it wouldn’t be surprising to the caller. Obviously the C thread can’t interact with the Python world after the GIL has been released.
Having worked on moving a proprietary language from its own green threaded VM to the JVM, and working on TruffleRuby I’m saddened to see this FUD still being trotted out. The GIL and similar mechanisms do not make your code thread safe. It _might_ save you from a small set of concurrency bugs, but they are fewer than you might think, and mostly it will just make intermittent existing issues that little bit more obvious when you move to real threads. Occasionally we would need to fix something in a core library or add a mutex, but those bugs could often be seen in a stress test with green threads or Ruby’s GVL.
My guess is the GIL or smaller mutexes will be needed for C extensions and a few other areas, but it’s also likely that could be moved to an opt in mechanism over time.
I don't get this. Just because you don't have the GIL doesn't mean that your previously single-threaded code is now multithreaded and stepping on itself.
From previous discussions, it's my understanding that the C integration is going to be the cause of the issues.
From a Python perspective it wouldn't necessarily be a big change, but everything can branch out to C, and there you're going to get in trouble with shared memory.
The most significant impact of this change is that python threads can run at the same time. Before, calls to C API were essentially cooperative multithreading yield points - even if you have 100 python threads, they can't run concurrently under the GIL. 99 are blocked waiting on the GIL.
C extensions have always been able to use real threading. But now there is no GIL to synchronize with the python interpreter on ingress/egress from the extension.
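A quick way to see that distinction today (a sketch of my own; hashlib's C implementation releases the GIL for large buffers, so it scales across threads while the pure-Python loop does not):

    import hashlib
    import threading
    import time

    DATA = b"x" * (64 * 1024 * 1024)   # 64 MiB hashed per call

    def pure_python():
        total = 0
        for i in range(5_000_000):     # holds the GIL the whole time
            total += i

    def c_extension():
        hashlib.sha256(DATA).hexdigest()   # C code drops the GIL while hashing

    def timed(target, n=4):
        threads = [threading.Thread(target=target) for _ in range(n)]
        start = time.perf_counter()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return time.perf_counter() - start

    print("pure Python, 4 threads:    ", timed(pure_python))
    print("GIL-releasing C, 4 threads:", timed(c_extension))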
No, but it means that your previous multithreaded code is no longer automatically prevented from stepping all over itself by having multiple threads accessing the same data:
GIL removal doesn't mean "make all of this lockless" it means "replace the GIL with fine-grained locking". So those problems are still solved for Python code. The three issues are the amount of work it takes to do right, the performance cost for single-threaded code, and the CAPI.
I'm simultaneously scared and enlightened seeing all these comments acting as if the GIL is some magic "makes your code thread/concurrency safe" panacea. I always saw it as a shim/hack to make CPython specifically easier to implement, not something that inherently makes code more thread safe or atomic. It's just more work to do things "the right way" across application boundaries, but from my understanding this PEP is Meta committing to do that work.
Removing the lock creates problems in existing code in practice. This is an ecosystem that has less focus on standards and more on "CPython is the reference implementation".
What non-transparent GIL specific behavior are developers relying on exactly?
When I say GIL specific behavior, I mean "python code that specifically requires a GLOBAL interpreter lock to function properly"
Not something that simply requires atomic access or any of the guarantees that the GIL has thus far provided, but, specifically, code that requires GIL-like behavior above any CPython implementation details that could be implemented with more fine-grained concurrency assurances?
I've seen some really cursed Python in my days, like checking against `locals()` to see if a variable was defined, a la JavaScript's 'foo in window' syntax (but I suppose more portable), but I can't recall anything specifically caring about a global interpreter lock (instead of what the GIL has semantically provided, which is much different).
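For readers who haven't seen it, that cursed pattern looks something like this (illustrative only; `expensive_call` is a made-up stand-in):

    def expensive_call():
        return 42    # stand-in for some real work

    def handler(flag):
        if flag:
            result = expensive_call()
        if "result" in locals():   # "was the variable defined?" a la JS's 'foo in window'
            return result
        return None

    print(handler(True), handler(False))   # 42 None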
> What non-transparent GIL specific behavior are developers relying on exactly?
They are relying on behavior of a single environment. We similarly see a lot of issues moving threaded code to a new operating system in C/C++, because the underlying implementation details (cooperative threading, m:n hybrid threading, cpu-bound scheduling, I/O and signaling behavior) will hide bugs which came from mistakes and assumptions.
Finally. You don't even have to read anything to work this out -- if things like dictionary access were no longer atomic, that would imply that threaded code without locks could crash the interpreter, which isn't going to happen.
> GIL removal doesn't mean "make all of this lockless"
Literally speaking, that's exactly what "removal" means. As far as I can tell, GP was wondering why there's so much discussion about replacement, since simply removing the GIL wouldn't break single-threaded code.
It is solvable by making the hard decision to move to Python 4 with no backward compatibility. The two core issues imo in Python are the GIL and the environment hell, and both simply can’t be solved while still keeping the 3 moniker. We’re in a field of constant workarounds and duct tape because we try to please everyone too much.
Python tried that (version 2 to 3) and both the community and dev team were traumatized by the effects enough they've publicly said it'll never happen again.
That means they didn't learn from it at all. The problem with Python 2 to python 3 is that it lost backwards compatibility because of very silly reasons like turning the print statement into a function. The vast majority of the problems could have been avoided by not making pointless changes with dubious benefits.
I seriously doubt anyone had problems fixing print as a statement. 2to3 fixed it...
I'll admit that, yes, changing string to bytes and unicode to string was a bit annoying, but the change itself wasn't fundamentally 'of dubious benefit', it did have benefits, and related to this, the only major issue was that you couldn't, for a long time, have code that worked in both where it came to literals. The biggest problem here was the implicit conversion from 2, that I agree needed to go.
Most of the other things can be trivially fixed automatically, or at least detected automatically, but without type hinting, it wasn't really easy to fix the automatic conversion.
There were other changes that were a bit tricky, but the majority of issues stemmed from the str/bytes change.
> perhaps there will be a flag to selectively operate a python process without the GIL (and leave the onous on you to completely test and validate your code works with potentially new and spooky behaviour).
Worked for Ruby. The original interpreter, MRI, has a GIL too. Rubinius and JRuby added multi-threading with limited amounts of pain, and people fixed libraries over the years. Sometimes just sprinkling lock blocks around particular FFI calls, or only doing them from a dedicated thread, will do the job.
Getting rid of the GIL would warrant the release of Python 4.0 for me, except the Python project shouldn't be supporting two different branches for as long as they supported 2.7.
I imagine there would need to be some kind of annotation to enable the GIL for a method and all of the code it calls, including libraries, so performant Python can take advantage of the lack of a GIL but old code doesn't break. Then all you need to do to maintain compatibility is to annotate your main() and your code should remain compatible for a while.
After all, the referenced PEP explicitly calls for making the GIL optional, not for removing it completely.
>Python project shouldn't be supporting two different branches for as long as they supported 2.7.
That wasn't the problem. The problem was not giving people the ability to make their code python 3 compatible while they were still stuck with python 2. The python 3 interpreter should have had a python 2 mode that gives you warnings.
Then just mark the extensions that are compatible with the GIL. And also have a switch that disables the GIL, controllable with an environment variable or launch option.
The GIL only protected you during any of those operations, so you can still switch threads waiting between LOAD_FAST and STORE_FAST and have a race.
There are a lot of things to be worried about with the GIL conversion of new race conditions that could happen, but there's already too much misinformation out there about the GIL, let's not spread this one even further.
> Fundamentally every single piece of python code ever written will have to stop and now worry about potential race conditions with innocent things like accessing a dictionary item or incrementing a value
Not really, just make those operations atomic or have automatic locking
It will be optional. If you don't want to worry about it, leave the option set to ON; those who can worry about it will have the option to set it to OFF.
The problem is, code that uses a lot of 3rd-party libraries and throws the ON switch for nogil will suddenly depend on all those libraries' maintainers having worried about this.
We can set it up such that if a module imports modules that don't support nogil, nogil will be automatically disabled for them too.
So, library designers will be under pressure to update their libraries to enable support. We could also have code tools that detect patterns that aren't GIL safe and throw out loud warnings
Why should library authors be put under pressure because someone else chose the wrong tool for the job, and is now trying to push that externality on to the community?
Python is a single-threaded language. That’s part of its DNA. The community has already been through one traumatic transition in recent history and the appetite for another one is low.
Library authors should not update their libraries to support multithreading, rather the people who want that should be forced to rewrite their code in a language that is more suitable for the problem they want to solve.
I disagree that single threadedness is in its DNA. It's an implementation detail of CPython. There are other implementations which don't have a GIL even today.
Would removing the GIL be a big change for CPython? Yes. But IMO it's worth it
> Python is a single-threaded language. That’s part of its DNA.
If the changes proposed in this PEP go through, that will no longer be the case. So library authors pretty much will have to either update, or see their modules wither the same way they would if they weren't updated as new Python 3.x versions come along.
> rather the people who want that should be forced to rewrite their code in a language that is more suitable for the problem they want to solve
The vast majority of the Python developer community WANT Python to support true multithreading and being able to solve these problems. We expend inordinate amounts of time and skill trying to make Python work around the GIL, eg. by utilizing multiprocessing.
We want Python to stay relevant long-term. In an age of abundant multi-core platforms, and workloads that can utilize them, the GIL is a major obstacle to that desire.
This is wrong, that ship sailed 10+ years ago. Python is used for almost everything, better get used to it.
This kind of garbage logic led to the horrible Ruby/Python/Javascript with C extensions split that makes their ecosystems very brittle, versus Java/C# where it is expected that things are fast enough without C and package management is much easier.
It's true that python is used for "almost everything", but it's only true because it plays nicely with C.
I understand the desire/demand for general purpose tools. The thing is, there are always tradeoffs. Acknowledging the tradeoffs and designing more specialized tools that work well together isn't necessarily garbage logic.
> Python is a single-threaded language. That’s part of its DNA.
Boooo. Maybe if all you do is ops scripts, but for those of us in data science this couldn't be further from the truth imo.
I'm not saying the syntax is easy for asyncio, or anywhere near as nice as golang or even kotlin is for concurrency, but it's definitely workable in a concurrent environment.
Yeah, I'm not a full-time developer but even for my bits of scripting I'd be wary of another big change in Python.
Personally I'd much rather see this effort go towards a new language which "feels" like Python but adopts more of the development experience of Go and Rust. From my tinkering it seems like Nim might already be that language, in which case what is needed is investment in its package ecosystem.
The problem with Python 2 to python 3 is that python 3 was essentially a new python 2 esque language instead of just being a major version bump.
If python 4 was no GIL python, then both Python and C would remain unchanged as a language.
Cheers. That does sound like it would impair actually being useful on multicore, multithreading cpus (which of course has been all of them for basically the last two decades)
To be honest... we already saw that the migration from Python 2 to Python 3 was a complete shitshow, but the value Python brings is so great the ecosystem survived.
I would gladly accept a Python 3 to Python 4 breaking change where all the necessary bits are changed to get real multi-threading (no-gil, gilectomy or whatever).
It's crazy how much work we put into trying to speed up these scripting languages. It's the same with Ruby. Shopify spends tons of dev resources on making JIT and stuff.
What do you find so crazy about it? Dynamically typed languages like these can be fast. It has already been demonstrated with Common Lisp and, more recently, JavaScript. Not only that, but LuaJIT has shown that you don't need impossibly complicated techniques or man-decades of effort to get very good performance. In Python's case, the main obstacle to improving the implementation isn't any feature of the language itself, but the giant ball and chain that is the C API.
Almost every single data structure access in python, even reading a dictionary item, at some level depends on the GIL for true correctness right now. It is very deeply embedded in cpython.
I think it’s the other way around? I haven’t run into any ML code that could be multithreaded that wasn’t written in C++, but have often run into server tasks that could use a polling thread, etc.
All the ML code is written in lower level languages and that’s very unlikely to change, GIL or no.
Yeah, you're right - even though CUDA is async, doing any preprocessing (in Python) can be harder if you don't have shared memory (the start-up latency hit of multiprocessing is not a problem in this context). I've only ever encountered "embarrassingly parallel" data-feeding problems, where the memory overhead of multiprocessing was small, but I could see other situations. Comment retracted.
The editorialized title "Meta pledges Three-Year sponsorship for Python if GIL removal is accepted" seems misleading at best. The actual wording was:
> support in the form of three engineer-years [...] between the acceptance of PEP 703 and the end of 2025
It seems much more like them pledging engineer time from Meta employees to work on this project than sponsorship, not a monetary sponsorship like the title implies.
That also depends on the scale of the monetary sponsorship, though. If I heard something like "Meta/Google/Whoever is the main sponsor of Python for the next 3 years", I'd assume (perhaps incorrectly) much more money per year than it takes to hire a single engineer. On the other hand, someone just saying "sure, we'll do the silver tier at your next few conferences" is worth a whole lot less than one engineer.
Regardless, the previous commenter's point stands that the title could be a lot more informative.
Yes, and the comment is to request help essentially:
"it would be great if Meta or another tech company could spare some engineers with established CPython internals experience to help the core dev team with this work."
It sounds like Facebook is volunteering 3 person-years which is nothing resembling a 3 year sponsorship. We're talking at least an order of magnitude difference. 3 person-years sounds like one team at Facebook working on the GIL for 3 months.
It's until the end of 2025, about 2.5 years from now. The comment was by GVR, who is presumably adding up all the pledged time to estimate whether there's enough in total.
Do you have JavaScript disabled? It is a deep link into a very, very long forum thread. Instead of pagination, comments outside of a certain window are loaded as you scroll.
Should take you to the first post, which is an overview of the proposal problem and potential solutions.
Mobile Chrome renders the page just fine. No idea about what other WebView you're using, but JS definitely needs to be enabled to load additional content.
In short, the GIL is the global interpreter lock, something that is innocuous in single-threaded software but a bane for performance in a true multithreading environment. I'd recommend reading up on it elsewhere if you're not familiar with it, as the post elides a lot of detail (the audience of the forum is, after all, people already familiar with Python internals).
"nogil", therefore, is a shorthand for CPython running without the global interpreter lock. The crux of the proposal is a discussion on how to achieve it without making large sacrifices to reliability or performance in single threaded scenarios.
JavaScript enabled. Like I said, first site that ever behaves like this; three quarters of the Internet would be broken if I didn't have JS. (And Firefox and archive.is also run JS afaik)
> No idea about what other WebView you're using,
Android default, so also chrome-like, just not the proprietary Google Chrome browser
> "nogil", therefore, is a shorthand for CPython running without the global interpreter lock.
Ooh, I should have realised what nogil means. If this had been in the headline with capitalisation like noGIL, I wouldn't even have needed to click to understand what it is Facebook wants to work on. Thanks!
The GIL is Python’s global interpreter lock, which bottlenecks performance in multithreaded applications by letting only one thread execute Python bytecode at a time. They’re working on removing it, but doing so exposes race conditions if programmers aren’t aware of the change.
Well, "put your money where your mouth is" would be more apposite. I commend them for volunteering to contribute in this way. A rising tide and all that.
Isn't this just how most open source software with enterprise customers operate?
Big companies pay (with money or developer time) in order to gain support and to steer the product to add features they want. The open source product lives on, developers get paid for their work, big co gets features they want, and everyone lives a happy life.
This is potentially problematic. It creates a conflict of interest: will the GIL proposal be accepted based on technical merits and potential benefits to the Python community overall, or just because the Python foundation wanted money? The problems that Meta faces in production are not necessarily the problems faced by every Python developer. In particular, Meta might value performance more than backwards-compatibility, while other Python users might feel differently. If Meta were to offer unconditional support, that would clearly be good for Python; attaching conditions to their support makes the situation trickier to navigate.
And it's 3 years of engineer time to work on this project, so I don't get the bribery insinuation. Engineer time to work on a specific project is only valuable if you consider the specific project valuable.
You could maybe make the argument that they're trying to swing priorities if this is considered valuable but less valuable than other things, but given the PSF doesn't actually employ anyone who it can direct to work on specific projects anyway, how is this any different from the already existing prioritisation process of "things that somebody is interested enough to go through the PEP process and then work on"?
TBH, having CPython internals experience is quite a qualification. They would probably contribute at least some of their time in the no-PEP-703 case, so the delta is even less.
Because the whole point of the thread is "It creates a conflict of interest because money" which it doesn't. Time costs money, sure, but time is not equal to money. There are many differences. Otherwise where is my money from HN.
Woah, are you kidding? I would have died to have people dedicated to OSS I worked on! Time is money. We get paid for our _time_ at the core of what we do as a profession. This should be plainly obvious.
The alternative is that Meta puts no money into salaries and benefits for up to three individuals' time on this project, or Meta puts in no money at all.
Meta is paying, one way or another, for this project.
> Meta is paying, one way or another, for this project.
Did I deny this?
To put it in an extremely clear example: time for someone to punch you != time for someone to work for you, while money is the same across the board. I hope you don't disagree with that.
What is Meta's end game with Python? Do they genuinely just want a stable build without GIL because of some cranky in-house code piles, or is this the first step towards a bigger mission to wrangle an entire ecosystem?
Calling it a "donation" is even too generous. They use python and would save a lot of money running it (less severs) if it could run multi-threaded. Implementing it themselves and maintaining a fork may be cheaper in the short term but due to ongoing maintenance and third-party extension incompatibility it would likely cost them more in the long run. So they are effectively investing 3 engineer/years of resources to A) ensure the project gets done and B) help it get done faster.
I don't see any charity here, this is just a smart business move by them. When you run as many servers as Meta does being able to run 1/4 as many instances is huge RAM and therefore money saving. It won't take long to add up to 3 engineer/years.
I agree charity is not the right way to frame it. But if this was a pure money decision they would save time and effort forking the language and improving it as closed source.
They have the expertise to run a language fork and because of the scale they run at they would easily save more money doing so. Meta contributing to OSS is not an economic decision. It is about branding, trying to attract technically strong individuals (who tend to contribute to OSS or want to), and sticking to their open culture.
> they would save time and effort forking the language and improving it as closed source.
Actually, experience has shown the opposite to be true. Time and time again even large organisations find their fork falling behind the community version and the pain of merging new versions from upstream grows and grows
The GIL has been a major pain point when training machine learning models, where every library is Python-first. The data preprocessing phase must happen quickly enough and in parallel with the GPU operations. The simplest option is doing data processing in several threads other than the one driving the GPU operations, but that is not possible with the GIL. TensorFlow and PyTorch each have their own workarounds for this problem, but many difficulties remain. Removing the GIL will be greatly beneficial to this domain at least.
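For context, the usual PyTorch-side workaround looks roughly like this (a sketch assuming PyTorch is installed; the dataset is a made-up stand-in for real decoding/augmentation work):

    import torch
    from torch.utils.data import DataLoader, Dataset

    class FakeImageDataset(Dataset):
        # Stand-in: a real __getitem__ would decode and augment a sample (CPU-heavy work).
        def __len__(self):
            return 10_000
        def __getitem__(self, idx):
            return torch.randn(3, 224, 224), idx % 10

    if __name__ == "__main__":
        # num_workers > 0 sidesteps the GIL by spawning worker *processes*; with
        # free threading this preprocessing could plausibly run in threads instead,
        # avoiding the serialization and startup costs of multiprocessing.
        loader = DataLoader(FakeImageDataset(), batch_size=32, num_workers=4)
        for images, labels in loader:
            pass   # the training step driving the GPU would go here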
Python is closer to C than we all like to imagine, in more than one way. Lots of access to low-level libraries. Long history of writing modules in C. And little in actual memory safety protection more than relying on programmers to please not write bugs in their C or ctypes code.
Philosophically, Python is close to C in the sense that low-abstraction interfaces to system functionality (file descriptors, memory) exist. Numpy allows access to uninitialized memory and you can manipulate how numpy uses memory (mmap, shared memory, sharing memory between objects). Ctypes allows access to malloc and dereferencing dangling pointers. This is not exactly normal Python, but it's inside the realm of the possible in Python.
All of this makes me worried that the challenge of "free threading" is quite large. I wonder if the subinterpreters solution offers a safer route. At least it's not pulling off the GIL band-aid all in one go.
On the other hand, allowing access to free threading would be a straight-line continuation of this culture: just make that feature available. ("we're writing system code in a high level language.")
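A tiny sketch of the kind of low-level access being described (my own example; assumes a typical Unix libc is findable via ctypes):

    import ctypes
    import ctypes.util

    # Load the C runtime and call malloc/free directly from Python.
    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    libc.malloc.restype = ctypes.c_void_p
    libc.free.argtypes = [ctypes.c_void_p]

    buf = libc.malloc(16)                                # raw, uninitialized heap memory
    view = ctypes.cast(buf, ctypes.POINTER(ctypes.c_ubyte))
    print([view[i] for i in range(16)])                  # whatever bytes happened to be there

    libc.free(buf)
    # Reading through `view` after the free would be a use-after-free;
    # nothing in Python stops you, exactly as described above.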
They committed three person-years, not three years. This project would need dozens of person-years over a three year period.
Also a comment on Github is not a binding support contract. Meta executives could deprioritize this project at any time... hell the person making the commitment could already be laid off, we have no clue. As someone who worked in big tech for a long time, trust me - it needs to be in writing with an exec signature.
I can't even imagine how many engineering years it would require to fix just the most used libraries that expect the GIL to exist. I assume 3 years wouldn't even cover the actual interpreter; throw in the library piece and how is this useful?
I can’t imagine you’ve read the proposal with a comment like this. The interpreter is already patched (twice in the proposal, for two different versions of Python), and Sam Gross has personally already patched many commonly used Python libraries. Here’s numpy patched, a mess of C and Fortran written for high performance code: https://github.com/colesbury/numpy/commits/v1.24.0-nogil
20+ years later, and billions of lines of Python code later, the GIL discussion is still there. Kind of an amazing mess of a problem… mainly shows the priority of Python being a duct-tape type solution from the start.
The latest decade in software engineering indicates that multi-processing is much preferable to multi-threading. It's the architecture Edge, Firefox, and Chrome use. So the question is what use cases multi-threading addresses that are not already covered by multi-processing? The memory overhead that multiple processes cause is relatively insignificant on recent machines with 32GB+ of RAM.
Browsers use multiple processes for security and reliability, not as an alternative to multi-threading. They extensively use multi-threading for performance (as does a lot of other modern software).
I don't think anyone is unilaterally going to move to multiprocess as a programming paradigm. Web browsers have their own reasons for using Multi-process, including security from memory isolation, resilience against crashes in the client side code, killing individual tabs, and so on.
I'd wager the vast majority of programs being written that need concurrency or parallelism still use multi threading or SIMD.
If anything, we've seen that the memory overhead of threads is unacceptably high, that's why everything these days uses thread pools / green threads / async. Multiprocessing is a niche - way too much memory usage, way too high communication cost.
Tons of use cases! There are lots of concurrent algorithms that use locking and shared memory instead of message-passing. Matrix multiplication or tree search come to mind.
Parallel blocked n^3 matrix multiplication doesn't use any synchronization primitives (short of join). The processes read from the same input matrices, but write to non-overlapping regions of memory. This is easily accomplished using posix shared memory. Indeed, this is how it is done when the matrices to be multiplied are too large to be handled by single worker nodes.
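A rough sketch of that pattern using the standard library and numpy (my own illustration; each worker writes only to its own rows of C, so join() is the only synchronization):

    import numpy as np
    from multiprocessing import Process
    from multiprocessing.shared_memory import SharedMemory

    N, WORKERS = 512, 4

    def worker(a_name, b_name, c_name, row_start, row_end):
        # Attach to the shared buffers: no copies, no locks.
        a_shm, b_shm, c_shm = SharedMemory(a_name), SharedMemory(b_name), SharedMemory(c_name)
        A = np.ndarray((N, N), dtype=np.float64, buffer=a_shm.buf)
        B = np.ndarray((N, N), dtype=np.float64, buffer=b_shm.buf)
        C = np.ndarray((N, N), dtype=np.float64, buffer=c_shm.buf)
        C[row_start:row_end] = A[row_start:row_end] @ B   # writes a disjoint block of rows
        a_shm.close(); b_shm.close(); c_shm.close()

    if __name__ == "__main__":
        nbytes = N * N * 8
        shms = [SharedMemory(create=True, size=nbytes) for _ in range(3)]
        A = np.ndarray((N, N), dtype=np.float64, buffer=shms[0].buf); A[:] = np.random.rand(N, N)
        B = np.ndarray((N, N), dtype=np.float64, buffer=shms[1].buf); B[:] = np.random.rand(N, N)
        C = np.ndarray((N, N), dtype=np.float64, buffer=shms[2].buf)

        step = N // WORKERS
        procs = [Process(target=worker,
                         args=(shms[0].name, shms[1].name, shms[2].name, i * step, (i + 1) * step))
                 for i in range(WORKERS)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()                      # the only synchronization point

        assert np.allclose(C, A @ B)
        for s in shms:
            s.close()
            s.unlink()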
I think at this point it is better for Meta to ditch Python and invest in a new programming language that mimics Python word for word but without the GIL. Google did it with their Carbon for C++, and Go for C/D/C++. Microsoft even wrestled with Java for .NET, and at one point their J++ 2 decades ago. Meta should create their own. Maybe dump their resources into Modular. Python at this point is basically a C++ with a hodgepodge of features and codebase. Starting clean, like Mozilla did with Rust, would help programming as a whole more than investing in a niche subset feature of Python over which they don't have any control at all (I am speaking from a corporate point of view). Python is like Perl: it will continue to have its place even decades later, whether generic or niche for AI. But the rest of us may be moving on to something more specialized and cleaner like Rust, Ruby, Modular, etc.
Perl never had anywhere near the wealth of libraries and community support Python enjoys today. The language is used nearly everywhere, from utility scripts to build pipelines, desktop applications, webservice backends, all the way up to scientific computing and AI.
Python shines because of its ecosystem and community support, which I'd guess (without having any numbers to back this up) may very well be unmatched among programming languages today. Maybe Java and C/C++ come somewhat close, but I doubt even these behemoths would stand a chance in a 1:1 comparison of the sheer size and versatility of what for Python is just a quick `pip install` away.
Any language that wants to replace Python, even internally at a company, will either have to deal with the fallout of leaving all that behind, OR be compatible with Python to an extent that it doesn't matter. And that's a really tall order.
I think the problem there is even a company like Meta wants to rest on the shoulders of the giant Python community. It makes sense for Meta to try to get parts of Cinder into Python.
why not just use an existing language? ATen (pytorch’s C++ tensor library) is written in C++ anyway, so why not use that?
i really don’t understand the trend with launching a new language every time it seems convenient; it may have the advantage of being made at {company}, but this is also its great downside: it exists, is maintained, and used only with relation to the project/company that prompted its creation
for example, swift is only used for swiftui and some other apple specific stuff, and no sane person is going to write an unrelated serious project in it
Agreed, they should port their AI stacks to a functional language. It looks like these efforts are designed to prop up Python for the scientific stack.
Most of this work has been on Javascript, which is single-threaded and less mutable than Python, but PyPy has also made some significant contributions. PyPy has a GIL.
There is also work on Ruby. Ruby is even more mutable than Python, but also has a GIL. Work on Ruby mirrors that on Javascript.
https://peps.python.org/pep-0703/#backwards-compatibility
This is nothing like the Python 2 > 3 transition.