#!/usr/bin/fmt

The port of the CQL source code to the ScidvsPC chess database application as a fully integrated search engine was undertaken by Lionel Hampton.  This file outlines the porting process, issues encountered along the way, and guidelines for porting future releases of the CQL source distribution.


The original standalone CQL executable utilized a subset of a very dated revision of the Scid source tree, primarily for parsing PGN files.  Because that parser naturally filled in Scid-native data structures as the games were parsed, the CQL engine also happily utilizes those same Scid-native data structures.  That convenience alone renders the port nearly trivial.

The first task in the port involved transitioning the CQL source base away from that ancient Scid code base to the more current v4.18 revision of the ScidvsPC source repository, in order to build a functional standalone cql executable off the same source as would eventually be utilized by the ported integrated CQL engine.  That primarily involved minor tweaks/additions to some Scid header files and the addition of a couple of accessor attributes for data elements which had been moved from private to public space by C&S.  It also required the creation and initialization of an integrated Tcl interpreter, because the current PGN parser now pulls in the CHARSETCONV objects, which in turn require a number of references out of the Tcl library.

The standalone cql executable resulting from the clean build was then tested and verified by comparing the query results from that exec -- against a 100K game PGN DB -- with the respective results produced by the original cql executable, utilizing the many CQL examples authored by Costeff & Stiller (found in the examples sub-directory).

Having a fully functional base built against the native headers and objects, the next step was to integrate the CQL engine into the Scid executable.  Since the CQL base already interfaced with the Scid Game:: and Position:: infrastructure, very little bridging work was required.  Rather than searching against the game stream generated by the parser, the search is conducted against the active scid database game list filter.

Following are the non-trivial aspects of the integration effort:

-- Convert the CQL lexical analyzer to work off an internal buffer rather than a file.  That work can be found in lexer.cpp.

-- Create a new search menu item and search window for feeding the CQL engine the necessary CQL syntax.  The search window rendered and managed by proc ::search::cql{} is modelled after proc ::search::moves{}.

-- Create a new Tcl command serving as an interface to the CQL engine, iterating through the game filter and invoking the CQL engine on a per-game basis.  The sc_search_cql() command is modelled after the sc_search_moves() command, and interfaces with the CQL engine through a surrogate function found in parser.cpp, which simply determines whether or not a given game matches the CQL filter-set supplied by the UI.

-- Rework the native CQL exception handling scheme, since any exception occurring in processing the games would result in the standalone application simply printing an error message and bailing out.  The ported/integrated engine relays error and diagnostic messages via a couple of global vars and performs a longjmp() out of the nest of stack frames.  Because longjmp() does not run destructors for automatic objects in the frames it skips, the use of the C++ std::setjmp()/std::longjmp() duo is only safe when nothing on the stack needs cleanup; here it should be inconsequential since virtually all allocations occur on the heap rather than the stack.

-- Replace altered games in the database.  Fortunately, this was a one-liner with respect to the CQL code, though significantly more was needed in sc_search_cql().  Altered games include those which have had their position match marks stripped in accordance with a user-controlled radio switch.

-- Resolve memory leaks:
  ** The CQL engine was not originally intended to be a long-lived entity.  Objects allocated on the heap were never deleted in the original code.  That has generously been corrected by Stiller and Costeff in their upstream project.
  ** The CQL code which allocates position marks (as game move comments) does so through Game::SetMoveComment(), which utilizes the StrAlloc infrastructure.  This memory is released by what passes for Game object destruction.
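
The error relay described in the exception handling item above can be sketched roughly as follows.  This is an illustration only, not the code in the port: the names g_cql_env, g_cql_error, cql_fail(), match_game() and search_games() are hypothetical, and a negative game id merely simulates a game that trips the engine.  Note that no object with a destructor lives on the stack between the setjmp() site and the longjmp() call.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstdio>
#include <cstring>

// Hypothetical globals; the real names in the port may differ.
static std::jmp_buf g_cql_env;     // saved context of the search loop
static char g_cql_error[256];      // error text relayed back to the caller

// Stand-in for a failure deep inside the engine: record the message in the
// global buffer, then jump straight back to the setjmp() site.
[[noreturn]] static void cql_fail(const char* msg) {
    std::snprintf(g_cql_error, sizeof g_cql_error, "%s", msg);
    std::longjmp(g_cql_env, 1);
}

// Stand-in for matching one game; a negative id simulates a bad game record.
static bool match_game(int game_id) {
    if (game_id < 0) cql_fail("null move encountered");
    return game_id % 2 == 0;
}

// Returns the number of matching games, or -1 if the engine bailed out.
static int search_games(const int* ids, int n) {
    assert(n >= 0);
    volatile int matches = 0;   // volatile: modified after setjmp()
    g_cql_error[0] = '\0';
    if (setjmp(g_cql_env) != 0) {
        return -1;              // reached via longjmp(); message is in g_cql_error
    }
    for (int i = 0; i < n; ++i) {
        if (match_game(ids[i])) matches = matches + 1;
    }
    return matches;
}
```

The caller checks for the -1 result and reads the global buffer to report the error, which is the role the Tcl command plays in the integrated engine.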


Stiller notes on heap cleanup implementation:

---Freeing memory---
To free memory in CQL, there is an abstract base class Deleteable (in
deleteable.h and deleteable.cpp) which has a no-argument constructor and a
virtual destructor. Deleteable keeps track of all its instances in a
set<Deleteable*>. The constructor adds "this" to that set. The destructor
just verifies that "this" is part of the instance pool and that the
destructor was called during the cleanup phase. The cleanup phase just
deletes all instances in the instance pool, and then deletes the instance
pool itself.

To make a CQL class able to be freed, make it inherit from Deleteable. See
e.g. "node.h" where the base class of all the nodes, "class Node", just
inherits from Deleteable. This makes all subclasses of Node deleteable.
(Note that CqlNode is the root of the parse tree, but the root of the
inheritance hierarchy is Node).

Now to delete all instances of the class (and of all other classes that
inherit from Deleteable), just call Deleteable::deleteable_cleanup();

The main restriction here is that no instance of Deleteable should be
deleted other than through the Deleteable::deleteable_cleanup() interface.
In particular, no such instance can be stack allocated.
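
(Porting note, not part of Stiller's text.) The scheme described above can
be sketched roughly as follows. This is a simplified illustration written
from the description, not the contents of deleteable.h: the pool() and
in_cleanup() helpers, live_count() and the ExampleNode class are
assumptions made for the sketch, and the pool is emptied here rather than
deleted.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Sketch only: follows the prose description, not the actual deleteable.h.
class Deleteable {
public:
    Deleteable() { pool().insert(this); }        // register every instance
    Deleteable(const Deleteable&) = delete;      // copies would escape the pool
    Deleteable& operator=(const Deleteable&) = delete;

    virtual ~Deleteable() {
        // Destruction is only legal during cleanup, and only for pooled objects.
        assert(in_cleanup());
        assert(pool().count(this) == 1);
    }

    // Delete every registered instance, then empty the pool.
    static void deleteable_cleanup() {
        in_cleanup() = true;
        for (Deleteable* d : pool()) delete d;   // destructors do not touch the pool
        pool().clear();
        in_cleanup() = false;
    }

    // Hypothetical helper, used here only to observe the pool size.
    static std::size_t live_count() { return pool().size(); }

private:
    static std::set<Deleteable*>& pool() {
        static std::set<Deleteable*> instances;
        return instances;
    }
    static bool& in_cleanup() {
        static bool flag = false;
        return flag;
    }
};

// Hypothetical subclass standing in for a parse tree node.
struct ExampleNode : Deleteable {
    int value = 0;
};
```

Instances are created with new and never deleted individually; a single
call to Deleteable::deleteable_cleanup() releases them all. A stack
allocated ExampleNode would trip the first assert when it goes out of
scope, which is the restriction noted above.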

This limitation means a few classes are not Deleteable, like Transform.
This should not cause much garbage.

Character strings are not freed. This will cause a memory leak roughly
equal to the total number of characters in all the CQL files parsed during
the current invocation. I did not see this as a significant limitation
because CQL files are typically short, but if a user does decide to run
Cql on ten million different CQL files in the same process, there might be
a problem.

Freeing character strings is straightforward. Virtually all character
strings are allocated in util::copy and this method can just keep track of
its allocations. Alternatively, and more cleanly, std::string should be
used rather than char* . I did change GameSortInfo , a value object used to
sort the output pgn strings corresponding to a game with a key, from using
char* to std::string . But this does not affect you, I believe.
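
(Porting note, not part of Stiller's text.) The first option, having the
copy routine keep track of its allocations, might look something like the
sketch below. The util_sketch namespace, allocations() and free_all() are
hypothetical names, and the real util::copy may have a different
signature.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

namespace util_sketch {   // hypothetical; stands in for the real util code

// Every string handed out by copy() is remembered here.
std::vector<char*>& allocations() {
    static std::vector<char*> list;
    return list;
}

// Duplicate a C string on the heap and record the allocation.
char* copy(const char* s) {
    assert(s != nullptr);
    char* dup = new char[std::strlen(s) + 1];
    std::strcpy(dup, s);
    allocations().push_back(dup);
    return dup;
}

// Release every string allocated through copy() so far.
void free_all() {
    for (char* p : allocations()) delete[] p;
    allocations().clear();
}

}  // namespace util_sketch
```

A call to free_all() could then sit alongside the Deleteable cleanup, on
the assumption that no pointer returned by copy() is used afterwards.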

Note that all Deleteable objects are actually created before the matching
of the CqlNode against the pgn file is begun. They are created by the CQL
parser/lexer .

The main thing to remember is that cql_initialize() must be called before a
new CQL file is read.

Structurally, to facilitate this, I removed most of the static local
variables in methods. I put the globals, including some cache variables
that used to be static variables, in cqlglobals.h and cqlglobals.cpp .


UPGRADES:

Because the common CQL ancestor for the CQL and ScidvsPC projects is not in the ScidvsPC SVN repository, CQL upgrades (merges) are accomplished out-of-tree.  Merges take place in a Git context with Git's “reuse recorded resolution” feature (git rerere) enabled, which "allows you to ask Git to remember how you’ve resolved a hunk conflict so that the next time it sees the same conflict, Git can automatically resolve it for you."  That feature saves a significant amount of work on merges subsequent to the first.

The current build and version can be found in version.cpp.

The common ancestor is:  CqlBuild="8.51" CqlVersion="5.1".


TESTING:

The CQL source distribution includes a set of scripts in the examples directory which can be used for testing the correctness of the port.  There is a bash script in the src/cql/scripts directory of the ScidvsPC tree which can be tweaked to assist in side-by-side comparisons of the search results given by the original cql command line executable vs. the ported scql command line executable.  The ported scql can be built from the root Makefile: make scql.

In testing the integrated engine, the only exceptions encountered have been triggered by either malformed syntax or by null moves in the game record.  Since Stiller and Costeff have recently implemented the handling of null moves in the CQL engine, it has proved difficult to trigger exceptions in the match phase of the search.  We have therefore resorted to forcing a contrived exception in the null move handling section of the engine for testing purposes.  To enable this contrived exception, grep the CQL source for 'NULL MOVE CONTRIVED EXCEPTION', uncomment the forced failed assert, and rebuild tkscid.  Then conduct a query on a game with a null move to trigger the exception.


