sql - Non Equality Join on Hive -
i trying emulate sql query on hive, grab data 2 tables do not have common field.
the 2 tables geometric (geohive) , 1 contains bunch of points (one each record) , other 1 grid (one cell each record). objective count how many points fit inside each cell.
the join condition between 2 tables geometric function itself. how in postgis:
select g.geom, count(t.geom) cnt grid g, points t st_contains(g.geom,t.geom) group g.geom
hive not accept multiple select, have perform join. thought work:
select count(1) grid join points st_contains(grid.geom,points.geom) group grid.geom;
, hive silently ignores condition:
"hive not support join conditions not equality conditions difficult express such conditions map/reduce job"
anyone has ideas of how refactor such query on hive?
i'll answer own question, state query attempt correct: failing other reasons, have nothing syntax.
select count(grid.geom) ptcnt grid join points st_contains(grid.geom,points.geom) group grid.geom;
this equivalent writing:
select count(grid.geom) ptcnt grid join points on(true) st_contains(grid.geom,points.geom) group grid.geom;
hive not in fact support non-equality joins, set join condition "true", , filter results on "where" clause.
apparently pretty standard behaviour spatial joins, can read here.
Comments
Post a Comment