summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorJonathan Gray <jsg@cvs.openbsd.org>2021-07-22 09:57:55 +0000
committerJonathan Gray <jsg@cvs.openbsd.org>2021-07-22 09:57:55 +0000
commit10290b3b59f0f40ee1ffa12bcfc521e533d5407d (patch)
treef867569632a42a7bc5bc4e0ec0b37a8f0bca816d
parent9f777f310e69dca2cb102448ecbe15b99d1214ed (diff)
Import Mesa 21.1.5
-rw-r--r--lib/mesa/docs/dispatch.rst72
1 files changed, 60 insertions, 12 deletions
diff --git a/lib/mesa/docs/dispatch.rst b/lib/mesa/docs/dispatch.rst
index c2942bf90..cd1ca3434 100644
--- a/lib/mesa/docs/dispatch.rst
+++ b/lib/mesa/docs/dispatch.rst
@@ -78,8 +78,9 @@ The problem with this simple implementation is the large amount of
overhead that it adds to every GL function call.
In a multithreaded environment, a naive implementation of
-``GET_DISPATCH()`` involves a call to ``_glapi_get_dispatch()`` or
-``_glapi_tls_Dispatch``.
+``GET_DISPATCH`` involves a call to ``pthread_getspecific`` or a similar
+function. Mesa provides a wrapper function called
+``_glapi_get_dispatch`` that is used by default.
3. Optimizations
----------------
@@ -89,15 +90,48 @@ performance hit imposed by GL dispatch. This section describes these
optimizations. The benefits of each optimization and the situations
where each can or cannot be used are listed.
-3.1. ELF TLS
+3.1. Dual dispatch table pointers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The vast majority of OpenGL applications use the API in a single
+threaded manner. That is, the application has only one thread that makes
+calls into the GL. In these cases, not only do the calls to
+``pthread_getspecific`` hurt performance, but they are completely
+unnecessary! It is possible to detect this common case and avoid these
+calls.
+
+Each time a new dispatch table is set, Mesa examines and records the ID
+of the executing thread. If the same thread ID is always seen, Mesa
+knows that the application is, from OpenGL's point of view, single
+threaded.
+
+As long as an application is single threaded, Mesa stores a pointer to
+the dispatch table in a global variable called ``_glapi_Dispatch``. The
+pointer is also stored in a per-thread location via
+``pthread_setspecific``. When Mesa detects that an application has
+become multithreaded, ``NULL`` is stored in ``_glapi_Dispatch``.
+
+Using this simple mechanism the dispatch functions can detect the
+multithreaded case by comparing ``_glapi_Dispatch`` to ``NULL``. The
+resulting implementation of ``GET_DISPATCH`` is slightly more complex,
+but it avoids the expensive ``pthread_getspecific`` call in the common
+case.
+
+.. code-block:: c
+ :caption: Improved ``GET_DISPATCH`` Implementation
+
+ #define GET_DISPATCH() \
+ (_glapi_Dispatch != NULL) \
+ ? _glapi_Dispatch : pthread_getspecific(&_glapi_Dispatch_key)
+
+3.2. ELF TLS
~~~~~~~~~~~~
Starting with the 2.4.20 Linux kernel, each thread is allocated an area
of per-thread, global storage. Variables can be put in this area using
-some extensions to GCC that called `ELF TLS`. By storing the dispatch table
-pointer in this area, the expensive call to ``pthread_getspecific`` and
-the test of ``_glapi_Dispatch`` can be avoided. As we don't support for
-Linux kernel earlier than 2.4.20, so we can always using `ELF TLS`.
+some extensions to GCC. By storing the dispatch table pointer in this
+area, the expensive call to ``pthread_getspecific`` and the test of
+``_glapi_Dispatch`` can be avoided.
The dispatch table pointer is stored in a new variable called
``_glapi_tls_Dispatch``. A new variable name is used so that a single
@@ -109,11 +143,22 @@ reference.
.. code-block:: c
:caption: TLS ``GET_DISPATCH`` Implementation
- extern __THREAD_INITIAL_EXEC struct _glapi_table *_glapi_tls_Dispatch;
+ extern __thread struct _glapi_table *_glapi_tls_Dispatch
+ __attribute__((tls_model("initial-exec")));
#define GET_DISPATCH() _glapi_tls_Dispatch
-3.2. Assembly Language Dispatch Stubs
+Use of this path is controlled by the preprocessor define
+``USE_ELF_TLS``. Any platform capable of using ELF TLS should use this
+as the default dispatch method.
+
+Windows has a similar concept, and beginning with Windows Vista, shared
+libraries can take advantage of compiler-assisted TLS. This TLS data
+has no fixed size and does not compete with API-based TLS (``TlsAlloc``)
+for the limited number of slots available there, and so ``USE_ELF_TLS`` can
+be used on Windows too, even though it's not truly ELF.
+
+3.3. Assembly Language Dispatch Stubs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Many platforms have difficulty properly optimizing the tail-call in the
@@ -133,17 +178,20 @@ different methods that can be used:
environments.
#. Using ``_glapi_Dispatch`` and ``_glapi_get_dispatch`` in
multithreaded environments.
+#. Using ``_glapi_Dispatch`` and ``pthread_getspecific`` in
+ multithreaded environments.
#. Using ``_glapi_tls_Dispatch`` directly in TLS enabled multithreaded
environments.
People wishing to implement assembly stubs for new platforms should
-focus on #3 if the new platform supports TLS. Otherwise implement #2.
-Environments that do not support multithreading are
+focus on #4 if the new platform supports TLS. Otherwise, implement #2
+followed by #3. Environments that do not support multithreading are
uncommon and not terribly relevant.
Selection of the dispatch table pointer access method is controlled by a
few preprocessor defines.
+- If ``USE_ELF_TLS`` is defined, method #3 is used.
- If ``HAVE_PTHREAD`` is defined, method #2 is used.
- If none of the preceding are defined, method #1 is used.
@@ -189,7 +237,7 @@ dispatch functions from being built.
.. _fixedsize:
-3.3. Fixed-Length Dispatch Stubs
+3.4. Fixed-Length Dispatch Stubs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To implement ``glXGetProcAddress``, Mesa stores a table that associates