WL#4210: new storage layer for QC page allocation + value adding monitor
Affects: Connector/.NET-5.2 — Status: In-Progress — Priority: Medium
The Query Cache memory allocation routines are not well suited to modern multicore architectures or to large, fragmented caches. Newer algorithms exist that do the job faster and with more efficient code. The new design will scale to larger systems with a lot of memory available (> 1GB) and will also support small embedded systems where memory is precious.

This is one of two worklogs outlining the work that needs to be done. This worklog outlines how the storage layer and page allocator should be designed and implemented. As a bonus, the new algorithms are regulated by parameters which can be configured manually or handled automatically as a value-added service through the proxy, the monitor or a pluggable server module.

What are the top benefits of the new design?
* Better scalability.
* More efficient memory usage.
* Faster online maintenance operations.
* Non-uniform memory access gives faster operation.
* Concurrent readers of the same result set prevent duplicate execution.
* Implemented as pluggable modules which can be used for differentiation.
== Storage layer components ==
The query cache is composed of modules: the server interface (API), the storage layer (uses the page allocator) and the lookup service (uses hash containers). This worklog addresses the storage layer.
NOTE: the lookup service is described in a separate WL.

== New SELECT or inserting into the cache ==
1. The lookup service has determined that the requested statement should be cached. As a result we have the opened tables, the parsed statement data and the current environment. We register the statement for caching with a result set writer and execute it normally to get the result set.
2. The query cache object is allocated from the page allocator and initialized with the statement text string, the current database and significant environment variables. The query cache object is marked as 'running' (or partial) and made available to the lookup service.
3. The result set is copied as packages into the storage layer. The storage layer allocates memory from the page allocator.
4. The result writer returns rows which are put into the currently allocated page until it overflows and a new page is requested. All pages are linked and attached to the query cache object.
5. Consolidation phase: the last page, which is probably the only _partially_ full one, is merged with other partially full pages of a similar size (2^n partitioning). The reason for this is to save memory. The minimum page size that can be consolidated is tunable, which allows us to scale for both embedded and large systems.
6. When the result set is complete, the query cache object is marked as 'complete'.

== Cached SELECT or retrieving from the cache ==
1. The lookup service has determined that the requested statement is in the cache. As a result a pointer to the first allocated page is returned. This pointer also identifies the query cache object, which can be retrieved at this time.
2. The results can be read from the cache and returned to the client immediately, or waited upon if the result is not complete yet.

== Partially cached SELECT or waiting for the result set to finish ==
1. The lookup service has determined that the requested statement is in the cache but it holds only a partial result. The current thread waits on a condition variable of the query cache object currently collecting the unfinished result set.
2. The thread belonging to the running (partial) query cache object signals all waiting threads whenever a result set package (rows) is delivered. This package can then be read immediately by all waiting client threads, and the process is repeated until the entire result set has been processed.

== Invalidating ==
API: Invalidate( TABLE_LIST )
1. For each element in the TABLE_LIST: find the corresponding QC-table object through the lookup service (example: table hash).
2. Increase the instantiated number (example: table modification time) of the QC-table object found by the lookup service. This number will be used to determine a cache hit.
TODO> explain how this system enables us to use fine-grained locking and remove a top-level mutex which is used in the current design.

== Page allocator ==
The page allocator is a page cache. The API is simple and aims to maintain a group of free pages of fixed size:
1. free_page: Return a page to the allocator.
2. get_page: Hand out a free page for use as needed.
The page allocator structure is an array of pointers. Each pointer points to a free page of fixed size. There are five supporting parameters: low watermark, high watermark, total pages count, a head and a tail. This topology is also known as a ring buffer. The high and low watermarks are used to determine whether pages should be returned to the OS kernel or new pages should be allocated. The head and tail should be implemented as an atomic "add and return" to optimize the access speed in a concurrent environment.
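
The page allocator described above can be sketched roughly as follows, in Python for brevity. This is an illustration under stated assumptions, not the worklog's implementation: the class name PageAllocator, the free_count helper, the 4096-byte page size and the exact refill/release policy are all hypothetical. A single lock stands in for the atomic "add and return" on head and tail that the design proposes, and a bytearray stands in for a page obtained from the OS kernel.

```python
import threading

PAGE_SIZE = 4096  # assumed fixed page size in bytes (not given in the worklog)


class PageAllocator:
    """Ring buffer of free fixed-size pages with low/high watermarks (sketch)."""

    def __init__(self, capacity, low_watermark, high_watermark, page_size=PAGE_SIZE):
        assert 0 <= low_watermark <= high_watermark <= capacity
        self.slots = [None] * capacity  # array of "pointers" to free pages
        self.capacity = capacity
        self.low = low_watermark
        self.high = high_watermark
        self.page_size = page_size
        self.head = 0         # running count of pages taken from the ring
        self.tail = 0         # running count of pages put into the ring
        self.total_pages = 0  # pages currently held from the OS (free + in use)
        # Assumption: one lock replaces the atomic add-and-return on head/tail.
        self._lock = threading.Lock()

    def free_count(self):
        """Number of free pages currently stored in the ring."""
        return self.tail - self.head

    def _new_page(self):
        # Stand-in for requesting a page from the OS kernel.
        self.total_pages += 1
        return bytearray(self.page_size)

    def _push(self, page):
        self.slots[self.tail % self.capacity] = page
        self.tail += 1

    def _pop(self):
        index = self.head % self.capacity
        page, self.slots[index] = self.slots[index], None
        self.head += 1
        return page

    def get_page(self):
        """Hand out a free page, then refill the ring up to the low watermark."""
        with self._lock:
            page = self._pop() if self.free_count() > 0 else self._new_page()
            while self.free_count() < self.low:
                self._push(self._new_page())
            return page

    def free_page(self, page):
        """Return a page; above the high watermark it goes back to the OS."""
        with self._lock:
            if self.free_count() >= self.high:
                self.total_pages -= 1  # dropping the reference releases the page
            else:
                self._push(page)
```

In this sketch a caller would create the pool once, for example PageAllocator(capacity=8, low_watermark=2, high_watermark=4), and then pair every get_page() with a later free_page(). Keeping head and tail as ever-increasing counters and reducing them modulo the capacity only on access is what would let a real implementation replace the lock with a fetch-and-add, although a lock-free version also needs care around the empty and full cases that this sketch does not address.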
Copyright (c) 2000, 2015, Oracle Corporation and/or its affiliates. All rights reserved.