sql - Solr: Properly qualifying PKs of records in delta import
There are similar questions and answers on the internet, but I still can't find a solution to a problem related to the way Solr combines query statements when doing a delta import, in combination with my current database structure (which I can't change).
Problem
At first, I was getting java.lang.IllegalArgumentException: deltaQuery has no column to resolve to declared primary key pk='id'
with the following data config:
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="..." user="..." password="..." readOnly="true" batchSize="2000" useCompression="true" />
  <document>
    <entity name="job" pk="id"
      query="select j.id, j.employer_id, j.title, j.description, j.created_on, e.name employer, s.address source_website_address from jobs j left join employers e on e.employer_id = j.employer_id left join source_websites s on s.id = j.source_website_id"
      deltaQuery="select j.id from jobs j where created_on > '${dataimporter.last_index_time}'"
    >
      <field column="id" name="id" />
      <field column="employer_id" name="employer_id" />
      <field column="title" name="title" />
      <field column="description" name="description" />
      <field column="employer" name="employer" />
      <field column="source_website_address" name="source_website_address" />
    </entity>
  </document>
</dataConfig>
Attempted solutions
Explicit entity primary key
I tried a possible solution of adding pk="id" to the entity element, which resulted in
Exception while processing: job document : SolrInputDocument(fields: []):org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select j.id, j.employer_id, j.title, j.description, j.created_on, e.name employer, s.address source_website_address from jobs j left join employers e on e.employer_id = j.employer_id left join source_websites s on s.id = j.source_website_id where id = 3355
    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:71)
    ...
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Column 'id' in where clause is ambiguous
    ...
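To make the failure easier to see in isolation, here is a minimal Python sketch that mimics what the logged statement suggests: a filter on the pk column is appended to the entity query. Everything in it is an assumption for illustration only: the helper names `append_pk_filter` and `try_query` are hypothetical, the tables are trimmed to a few columns, and an in-memory SQLite database stands in for MySQL, so the error wording differs from the MySQL message above.

```python
import sqlite3

# Trimmed version of the entity query (assumption: fewer columns, one join).
QUERY = (
    "select j.id, j.title, s.address source_website_address "
    "from jobs j "
    "left join source_websites s on s.id = j.source_website_id"
)


def append_pk_filter(query, pk, value):
    """Hypothetical model of the logged statement: add '<pk> = <value>' to the query."""
    joiner = " and " if " where " in query.lower() else " where "
    return f"{query}{joiner}{pk} = {value}"


def try_query(sql):
    """Run sql against a throwaway SQLite schema; return (rows, error_message)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table source_websites (id integer primary key, address text);
        create table jobs (id integer primary key, title text, source_website_id integer);
        insert into source_websites values (1, 'example.org');
        insert into jobs values (3355, 'Editor', 1);
        """
    )
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.OperationalError as exc:
        return None, str(exc)
    finally:
        conn.close()


if __name__ == "__main__":
    # Unqualified pk: both joined tables expose a column named id.
    print(try_query(append_pk_filter(QUERY, "id", 3355)))
    # Qualified pk: the filter names exactly one column.
    print(try_query(append_pk_filter(QUERY, "j.id", 3355)))
```

In this sketch the unqualified filter is rejected as ambiguous, while the filter on `j.id` returns the single matching row. Whether Solr accepts a table-qualified name in its pk attribute is not something this sketch demonstrates.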
Column aliasing
The column id is ambiguous because both the jobs table and the source_websites table contain a column with that name. For this reason, I tried aliasing the primary-key columns to avoid the problem:
... <entity name="job" pk="entity_id" query="select j.id entity_id, ... deltaQuery="select j.id entity_id ...
However, with the columns set up this way, I get
...
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select j.id entity_id, j.employer_id, j.title, j.description, j.created_on, e.name employer, s.address source_website_address from jobs j left join employers e on e.employer_id = j.employer_id left join source_websites s on s.id = j.source_website_id where entity_id = 199983
    ...
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'entity_id' in 'where clause'
    ...
Explicit delta import query
Finally, I tried explicitly defining deltaImportQuery as
deltaImportQuery="select j.id entity_id, j.employer_id, j.title, j.description, j.created_on, e.name employer, s.address source_website_address from jobs j left join employers e on e.employer_id = j.employer_id left join source_websites s on s.id = j.source_website_id where id='${dataimporter.delta.entity_id}'"
However, this also results in Unknown column 'entity_id' in 'where clause'.
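For comparison, here is a rough Python sketch of the shape I would expect an unambiguous delta import statement to have: the alias stays in the select list, the filter uses the table-qualified column j.id, and the placeholder is filled from one row of the delta query. This is only a model under stated assumptions: `resolve_placeholders` and `fetch` are hypothetical helpers, the plain string replacement is a stand-in for whatever Solr does internally, the schema is trimmed, and SQLite stands in for MySQL.

```python
import sqlite3

# Assumed shape of a qualified delta import statement (trimmed column list).
DELTA_IMPORT_QUERY = (
    "select j.id entity_id, j.title, s.address source_website_address "
    "from jobs j "
    "left join source_websites s on s.id = j.source_website_id "
    "where j.id = '${dataimporter.delta.entity_id}'"
)


def resolve_placeholders(template, delta_row):
    """Hypothetical stand-in: fill ${dataimporter.delta.<column>} from one delta row."""
    sql = template
    for column, value in delta_row.items():
        sql = sql.replace("${dataimporter.delta." + column + "}", str(value))
    return sql


def fetch(sql):
    """Run sql against a throwaway SQLite schema and return all rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table source_websites (id integer primary key, address text);
        create table jobs (id integer primary key, title text, source_website_id integer);
        insert into source_websites values (1, 'example.org');
        insert into jobs values (199983, 'Editor', 1);
        """
    )
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # One row as the aliased delta query would report it: {"entity_id": ...}.
    statement = resolve_placeholders(DELTA_IMPORT_QUERY, {"entity_id": 199983})
    print(statement)
    print(fetch(statement))
```

The point of the sketch is only that the where clause never mentions the alias or the bare id, so neither of the two errors above can be triggered by the statement text itself.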
How can I get Solr to create an unambiguous select query when executing a delta import in this case? It seems I've tried everything except changing the database schema, which I really do not want to do.
Thanks for any help.