WL#7221: Geometry Set Operations

Status: Complete

Implement MySQL GIS relation check functions using boost geometry.
Gis features are more and more powerful, important and pervasive in
RDBMS products, and MySQL GIS needs extensive improvements
in terms of functionality, reliablity and performance.
We want to achive this by replacing the home grown gis computation core
with boost geometry, just like PostGIS is using GEOS, so that
we have all features boost geometry can provide. Boost geometry
is more powerful and robust and keeps growing, we will grow with it.
F-0: All functions in this WL are *new* to end users, because the documentation
says they don't exist before. However, they are tested in mysql public and
private test suites. Thus we need to cover usage of these functions in MySQL
documentation.

F-1: Usage or input data format shall not change, existing SQL query
code shall need no change at all.
F-2: Some type combinations which were not supported before
will be supported, specifically the following are supported now:
intersection(point, point),
intersection(point, multipoint),
intersection(point, linestring),
intersection(point, multilinestring),
intersection(point, polygon),
intersection(point, multipolygon),
intersection(multipoint, point),
intersection(multipoint, multipoint),
intersection(multipoint, linestring),
intersection(multipoint, multilinestring),
intersection(multipoint, polygon),
intersection(multipoint, multipolygon),
intersection(linestring, point),
intersection(linestring, multipoint),
intersection(polygon, point),
intersection(polygon, multipoint),
intersection(multilinestring, point),
intersection(multilinestring, multipoint),
intersection(multipolygon, point),
intersection(multipolygon, multipoint)


union(point, point),
union(point, multipoint),
union(point, linestring),
union(point, multilinestring),
union(point, polygon),
union(point, multipolygon),
union(multipoint, point),
union(multipoint, multipoint),
union(multipoint, linestring),
union(multipoint, multilinestring),
union(multipoint, polygon),
union(multipoint, multipolygon),
union(linestring, point),
union(linestring, multipoint),
union(polygon, point),
union(polygon, multipoint),
union(multilinestring, point),
union(multilinestring, multipoint),
union(multipolygon, point),
union(multipolygon, multipoint)


difference(point, point),
difference(point, multipoint),
difference(point, linestring),
difference(point, multilinestring),
difference(point, polygon),
difference(point, multipolygon),
difference(multipoint, point),
difference(multipoint, multipoint),
difference(multipoint, linestring),
difference(multipoint, multilinestring),
difference(multipoint, polygon),
difference(multipoint, multipolygon),
difference(linestring, point),
difference(linestring, multipoint),
difference(polygon, point),
difference(polygon, multipoint),
difference(polygon, linestring),
difference(polygon, multilinestring),
difference(multilinestring, point),
difference(multilinestring, multipoint),
difference(multipolygon, point),
difference(multipolygon, multipoint),
difference(multipolygon, linestring),
difference(multipolygon, multilinestring)

F-3: Calling a set operation with an unsupported type
combination will return an error ER_GIS_UNSUPPORTED_ARGUMENT to
client, so that the entire query doesn't
produce any result, even if only one row in the table has unsupported geometry.
Unsupported geometries: none.

F-4: Geometry set operation calls of two geometries of different srids is always
not supported, and client will get error ER_GIS_DIFFERENT_SRIDS, and the error
will cause the entire query to fail without producing any result even if only
one row has this error.

F-5: Max SRID number is 4294967295 (0xFFFFFFFF), if given srid larger than this,
the value's lower 4 bytes are used as the srid value (4 bytes unsigned integer).
I-1: No new files
     Here it's supposed that #WL7220 will be completed first, so we won't list
     the need for boost library, it's the same as that of #WL7220.

I-2: No new syntax
     Here it's supposed that #WL7220 will be completed first, so we won't list
     the changes to srid, it's the same as that of #WL7220.

I-3: No new commands
I-4: No new tools.
I-5: No impact on existing functionality.
I-6: Result representation change:

The set operations' result geometry's WKT/WKB may change, but the content
in essence doesn't change. For example, in previous versions we may return a
geometry collection having two polygons, now we return a geometry collection
containing a multipolygon, which contains the same two polygons.
And also, we will always return polygons with its rings' points in
anti-clockwise order. Both are due to how boost geometry works.
It's not straight forward to use boost geometry in mysql gis, because
 1) it doesn't support geometry collection(OGC defined),

 2) Not all 4 standard check functions(intersection, union, difference,
symdifference) are fully supported.

 3) All the geometry types are separate and independent class templates, no
inheritance or polymorphism, and all the algorithm functions are function
templates requring geometry type parameters.

 4) Boost geometry set operations return result in a unique way, we have to do
some further computation to get correct results.

What we can and will do in this worklog is to reimplement the gis set operation
functions by using existing boost geometry features as much as we can, and fall
back to old gis implementation if we can't implement a feature using boost
geometry directly or indirectly. And in the future when boost geometry start to
support more type combinations we can easily use it instead, we won't wait for
boost geometry enhancement/improvement in this work log.


All gis set operation functions have two parameters, given a pair of argument g1
and g2, for a set operation function, if the type combination is directly
supported by boost geometry, simply call the corresponding boost geometry
function template instantiated with g1's and g2's geometry types; 

This is static dispatching --- we have to enumerate all the possible type
combinations of g1 and g2, which are supported by each boost geometry set
operation function template, and instantiate the function template with the
types, and construct the boost geometry objects bg1 and bg2 from g1 and g2, and
then call the function with the arguments bg1 and bg2.

For the unsupported type combinations, try to indirectly use supported geometry
types of boost geometry to do further computing to get the result. Some type
combinations can be computed using the supported type combinations of the same 
set operation, or by calling relation check functions to test relationship of g1
and g2, e.g. multipoint INTERSECTION polygon isn't supported by
boost::geometry::intersection, but we can check whether each point of the
multipoint is in the polygon, and collect the ones that are not to form another
multipoint, which is the result.

Another unique complication of boost geometry set operations is that they need a
3rd template type --- the geometry type to take out set operation result, and
valid options are multipoint, multilinestring and multipolygon. For any set
operation, and for each pair of type combination, there are often more than one
supported result parameter types, we need to make sure whether we need to call
the function multiple times with each supported type of result parameter, and if
so, we need to do the multiple calls and assemble the results into a correrct
final result.

In order to work with existing gis code, we need to keep and use the existing
Geometry class and its children classes in existing mysql gis, and
use these classes in new implementation, and keep the geometry internal format
unchanged, also keep the sql functions GeomFromText/GeomFromWkb's internal result
format unchanged.

Construct boost geometry objects and call boost geometry algo functions in
Item_func_spatial_operation::val_str, reimplement this function using above methods.








Following is a complete list of type combinations for each set operation 
function that are implemented using boost geometry directly or indirectly 
via further computation:

1. Implemented via direct boost geometry set operation calls:

Note that the 3rd geometry type of the rows below is the supported result 
type, for example, we have 

	intersection linestring multipolygon multipoint
	intersection linestring multipolygon multilinestring

this means we can call boost::geometry::intersection with a linestring and 
multipolygon geometry objects as input argument and use a multipoint or 
multilinestring as result argument, to take back the intersection points
or linestrings. In this case it's possible for both results to appear, so we
will need to combine the two results and avoid repetitive points(remove those
that are already on the result line strings from the multipoint result).

For other cases such as 

	difference linestring polygon multipoint
	difference linestring polygon multilinestring

it's impossible for there to be a multipoint result, then we will ignore 
it and simply fetch result with a multilinestring object in the call. And
consequently although BG supports many type combinations for symdifference, the
real useful ones are can be directly called are only four (polygon
combinations), the rest are useless and have to be indirectly computed:
symdifference polygon polygon multipolygon
symdifference polygon multipolygon multipolygon
symdifference multipolygon polygon multipolygon
symdifference multipolygon multipolygon multipolygon


All type combinations directly supported by BG:

union_ polygon polygon multipolygon
union_ polygon multipolygon multipolygon
union_ multipolygon polygon multipolygon
union_ multipolygon multipolygon multipolygon
intersection linestring linestring multipoint
intersection linestring polygon multipoint
intersection linestring polygon multilinestring
intersection linestring multilinestring multipoint
intersection linestring multipolygon multipoint
intersection linestring multipolygon multilinestring
intersection polygon linestring multipoint
intersection polygon linestring multilinestring
intersection polygon polygon multipoint
intersection polygon polygon multipolygon
intersection polygon multilinestring multilinestring
intersection polygon multipolygon multipoint
intersection polygon multipolygon multipolygon
intersection multilinestring linestring multipoint
intersection multilinestring polygon multilinestring
intersection multilinestring multilinestring multipoint
intersection multilinestring multipolygon multilinestring
intersection multipolygon linestring multipoint
intersection multipolygon linestring multilinestring
intersection multipolygon polygon multipoint
intersection multipolygon polygon multipolygon
intersection multipolygon multilinestring multilinestring
intersection multipolygon multipolygon multipoint
intersection multipolygon multipolygon multipolygon
difference linestring linestring multipoint
difference linestring polygon multipoint
difference linestring polygon multilinestring
difference linestring multilinestring multipoint
difference linestring multipolygon multipoint
difference linestring multipolygon multilinestring
difference polygon linestring multipoint
difference polygon polygon multipoint
difference polygon polygon multipolygon
difference polygon multilinestring multilinestring
difference polygon multipolygon multipoint
difference polygon multipolygon multipolygon
difference multilinestring polygon multilinestring
difference multilinestring multilinestring multipoint
difference multilinestring multipolygon multilinestring
difference multipolygon linestring multipoint
difference multipolygon polygon multipoint
difference multipolygon polygon multipolygon
difference multipolygon multipolygon multipoint
difference multipolygon multipolygon multipolygon
sym_difference linestring linestring multipoint
sym_difference linestring polygon multipoint
sym_difference linestring multipolygon multipoint
sym_difference polygon linestring multipoint
sym_difference polygon polygon multipoint
sym_difference polygon polygon multipolygon
sym_difference polygon multilinestring multilinestring
sym_difference polygon multipolygon multipoint
sym_difference polygon multipolygon multipolygon
sym_difference multilinestring polygon multilinestring
sym_difference multilinestring multilinestring multipoint
sym_difference multipolygon linestring multipoint
sym_difference multipolygon polygon multipoint
sym_difference multipolygon polygon multipolygon
sym_difference multipolygon multipolygon multipoint
sym_difference multipolygon multipolygon multipolygon



2. Computed type combinations
Following is a list of computed type combinations for each set operation, which
are NOT directly supported by boost geometry, but can be computed using the type
combinations that boost geometry directly supports. For the symetric
functions(intersection, union, symdifference), code is written without
redundancy --- simply switch the two parameters and call the function again. In
the implementation we often need to do geometry relation checks, we do so by
calling the Item_func_spatial_rel's core function(a static one).

2.1 intersection
	intersection(point, point),
	intersection(point, multipoint),
	intersection(point, linestring),
	intersection(point, multilinestring),
	intersection(point, polygon),
	intersection(point, multipolygon),
	intersection(multipoint, point),
	intersection(multipoint, multipoint),
	intersection(multipoint, linestring),
	intersection(multipoint, multilinestring),
	intersection(multipoint, polygon),
	intersection(multipoint, multipolygon),
	intersection(linestring, point),
	intersection(linestring, multipoint),
	intersection(polygon, point),
	intersection(polygon, multipoint),
	intersection(multilinestring, point),
	intersection(multilinestring, multipoint),
	intersection(multipolygon, point),
	intersection(multipolygon, multipoint)

2.2 union
	union(point, point),
	union(point, multipoint),
	union(point, linestring),
	union(point, multilinestring),
	union(point, polygon),
	union(point, multipolygon),
	union(multipoint, point),
	union(multipoint, multipoint),
	union(multipoint, linestring),
	union(multipoint, multilinestring),
	union(multipoint, polygon),
	union(multipoint, multipolygon),
	union(linestring, point),
	union(linestring, multipoint),
	union(polygon, point),
	union(polygon, multipoint),
	union(multilinestring, point),
	union(multilinestring, multipoint),
	union(multipolygon, point),
	union(multipolygon, multipoint)

2.3 difference

	difference(point, point),
	difference(point, multipoint),
	difference(point, linestring),
	difference(point, multilinestring),
	difference(point, polygon),
	difference(point, multipolygon),
	difference(multipoint, point),
	difference(multipoint, multipoint),
	difference(multipoint, linestring),
	difference(multipoint, multilinestring),
	difference(multipoint, polygon),
	difference(multipoint, multipolygon),
	difference(linestring, point),
	difference(linestring, multipoint),
	difference(polygon, point),
	difference(polygon, multipoint),
	difference(polygon, linestring),
	difference(polygon, multilinestring),
	difference(multilinestring, point),
	difference(multilinestring, multipoint),
	difference(multipolygon, point),
	difference(multipolygon, multipoint),
	difference(multipolygon, linestring),
	difference(multipolygon, multilinestring)

2.4 symdifference
All the type combinations other than the 4 directly supported
polygon/multipolygon combinations. They are implemented using the definition of
symdifference: 
g1 symdifference g2 := (g1 union g2) difference (g1 intersection g2)