Import Mesa 21.1.5

author: Jonathan Gray <jsg@cvs.openbsd.org> 2021-07-22 09:57:55 +0000
committer: Jonathan Gray <jsg@cvs.openbsd.org> 2021-07-22 09:57:55 +0000
commit: 10290b3b59f0f40ee1ffa12bcfc521e533d5407d (patch)
tree: f867569632a42a7bc5bc4e0ec0b37a8f0bca816d
parent: 9f777f310e69dca2cb102448ecbe15b99d1214ed (diff)
1 files changed, 60 insertions, 12 deletions
diff --git a/lib/mesa/docs/dispatch.rst b/lib/mesa/docs/dispatch.rst
index c2942bf90..cd1ca3434 100644
--- a/lib/mesa/docs/dispatch.rst
+++ b/lib/mesa/docs/dispatch.rst
@@ -78,8 +78,9 @@ The problem with this simple implementation is the large amount of
 overhead that it adds to every GL function call.
 
 In a multithreaded environment, a naive implementation of
-``GET_DISPATCH()`` involves a call to ``_glapi_get_dispatch()`` or
-``_glapi_tls_Dispatch``.
+``GET_DISPATCH`` involves a call to ``pthread_getspecific`` or a similar
+function. Mesa provides a wrapper function called
+``_glapi_get_dispatch`` that is used by default.
 
 3. Optimizations
 ----------------
@@ -89,15 +90,48 @@ performance hit imposed by GL dispatch. This section describes these
 optimizations. The benefits of each optimization and the situations
 where each can or cannot be used are listed.
 
-3.1. ELF TLS
+3.1. Dual dispatch table pointers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The vast majority of OpenGL applications use the API in a single
+threaded manner. That is, the application has only one thread that makes
+calls into the GL. In these cases, not only do the calls to
+``pthread_getspecific`` hurt performance, but they are completely
+unnecessary! It is possible to detect this common case and avoid these
+calls.
+
+Each time a new dispatch table is set, Mesa examines and records the ID
+of the executing thread. If the same thread ID is always seen, Mesa
+knows that the application is, from OpenGL's point of view, single
+threaded.
+
+As long as an application is single threaded, Mesa stores a pointer to
+the dispatch table in a global variable called ``_glapi_Dispatch``. The
+pointer is also stored in a per-thread location via
+``pthread_setspecific``. When Mesa detects that an application has
+become multithreaded, ``NULL`` is stored in ``_glapi_Dispatch``.
+
+Using this simple mechanism the dispatch functions can detect the
+multithreaded case by comparing ``_glapi_Dispatch`` to ``NULL``. The
+resulting implementation of ``GET_DISPATCH`` is slightly more complex,
+but it avoids the expensive ``pthread_getspecific`` call in the common
+case.
+
+.. code-block:: c
+   :caption: Improved ``GET_DISPATCH`` Implementation
+
+   #define GET_DISPATCH() \
+       ((_glapi_Dispatch != NULL) \
+           ? _glapi_Dispatch : pthread_getspecific(_glapi_Dispatch_key))
+
+3.2. ELF TLS
 ~~~~~~~~~~~~
 
 Starting with the 2.4.20 Linux kernel, each thread is allocated an area
 of per-thread, global storage. Variables can be put in this area using
-some extensions to GCC that called `ELF TLS`. By storing the dispatch table
-pointer in this area, the expensive call to ``pthread_getspecific`` and
-the test of ``_glapi_Dispatch`` can be avoided. As we don't support for
-Linux kernel earlier than 2.4.20, so we can always using `ELF TLS`.
+some extensions to GCC. By storing the dispatch table pointer in this
+area, the expensive call to ``pthread_getspecific`` and the test of
+``_glapi_Dispatch`` can be avoided.
 
 The dispatch table pointer is stored in a new variable called
 ``_glapi_tls_Dispatch``. A new variable name is used so that a single
@@ -109,11 +143,22 @@ reference.
 .. code-block:: c
    :caption: TLS ``GET_DISPATCH`` Implementation
 
-   extern __THREAD_INITIAL_EXEC struct _glapi_table *_glapi_tls_Dispatch;
+   extern __thread struct _glapi_table *_glapi_tls_Dispatch
+       __attribute__((tls_model("initial-exec")));
 
    #define GET_DISPATCH() _glapi_tls_Dispatch
 
-3.2. Assembly Language Dispatch Stubs
+Use of this path is controlled by the preprocessor define
+``USE_ELF_TLS``. Any platform capable of using ELF TLS should use this
+as the default dispatch method.
+
+Windows has a similar concept, and beginning with Windows Vista, shared
+libraries can take advantage of compiler-assisted TLS. This TLS data
+has no fixed size and does not compete with API-based TLS (``TlsAlloc``)
+for the limited number of slots available there, and so ``USE_ELF_TLS`` can
+be used on Windows too, even though it's not truly ELF.
+
+3.3. Assembly Language Dispatch Stubs
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Many platforms have difficulty properly optimizing the tail-call in the
@@ -133,17 +178,20 @@ different methods that can be used:
    environments.
 #. Using ``_glapi_Dispatch`` and ``_glapi_get_dispatch`` in
    multithreaded environments.
+#. Using ``_glapi_Dispatch`` and ``pthread_getspecific`` in
+   multithreaded environments.
 #. Using ``_glapi_tls_Dispatch`` directly in TLS enabled multithreaded
    environments.
 
 People wishing to implement assembly stubs for new platforms should
-focus on #3 if the new platform supports TLS. Otherwise implement #2.
-Environments that do not support multithreading are
+focus on #4 if the new platform supports TLS. Otherwise, implement #2
+followed by #3. Environments that do not support multithreading are
 uncommon and not terribly relevant.
 
 Selection of the dispatch table pointer access method is controlled by a
 few preprocessor defines.
 
+-  If ``USE_ELF_TLS`` is defined, method #4 is used.
 -  If ``HAVE_PTHREAD`` is defined, method #2 is used.
 -  If none of the preceding are defined, method #1 is used.
 
@@ -189,7 +237,7 @@ dispatch functions from being built.
 
 .. _fixedsize:
 
-3.3. Fixed-Length Dispatch Stubs
+3.4. Fixed-Length Dispatch Stubs
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 To implement ``glXGetProcAddress``, Mesa stores a table that associates
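
The dual dispatch table pointer scheme restored in section 3.1 of this diff can be sketched as a small standalone C file. This is only an illustration under stated assumptions, not Mesa's implementation: every ``demo_`` name is hypothetical, the table type is a stand-in for ``struct _glapi_table``, and the atomic pointer and mutex are additions made so the sketch is well defined under concurrent use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for Mesa's struct _glapi_table. */
struct demo_table { int id; };

/* Global fast-path pointer: non-NULL only while one thread has set a table. */
static struct demo_table *_Atomic demo_Dispatch;

static pthread_key_t demo_dispatch_key;            /* per-thread slot */
static pthread_once_t demo_key_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_t demo_first_thread;                /* first thread seen */
static int demo_have_first_thread;
static int demo_multithreaded;

static void demo_make_key(void)
{
    int rc = pthread_key_create(&demo_dispatch_key, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Record the table for the calling thread and update the global pointer. */
void demo_set_dispatch(struct demo_table *t)
{
    pthread_once(&demo_key_once, demo_make_key);
    pthread_setspecific(demo_dispatch_key, t);     /* always kept per thread */

    pthread_mutex_lock(&demo_lock);
    if (!demo_have_first_thread) {
        demo_first_thread = pthread_self();
        demo_have_first_thread = 1;
    } else if (!pthread_equal(demo_first_thread, pthread_self())) {
        demo_multithreaded = 1;                    /* a second thread ID was seen */
    }
    /* Once multithreaded, the global stays NULL to force the slow path. */
    demo_Dispatch = demo_multithreaded ? NULL : t;
    pthread_mutex_unlock(&demo_lock);
}

/* Equivalent of GET_DISPATCH(): global first, per-thread slot as fallback. */
struct demo_table *demo_get_dispatch(void)
{
    struct demo_table *t = demo_Dispatch;
    if (t != NULL)
        return t;
    pthread_once(&demo_key_once, demo_make_key);
    return pthread_getspecific(demo_dispatch_key);
}

static void *demo_thread_main(void *arg)
{
    demo_set_dispatch(arg);
    return demo_get_dispatch();
}

/* Set a table on a fresh thread and return the table that thread observed. */
struct demo_table *demo_run_on_new_thread(struct demo_table *t)
{
    pthread_t th;
    void *seen = NULL;
    int rc = pthread_create(&th, NULL, demo_thread_main, t);
    assert(rc == 0);
    rc = pthread_join(th, &seen);
    assert(rc == 0);
    (void)rc;
    return seen;
}
```

While only one thread has called ``demo_set_dispatch``, lookups return the global pointer without touching the per-thread slot. After ``demo_run_on_new_thread`` sets a table from a second thread, the global pointer becomes ``NULL`` and each thread gets its own table back through ``pthread_getspecific``.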