[Question]: Gemma Issue with LlamaIndex "query_engine = SQLAutoVectorQueryEngine(sql_tool, vector_tool)"

#57
by Alwiin - opened

Question
Hi everyone,

I'm running into an implementation issue and would appreciate some help.

I have been following the tutorial on retrieving data from both SQL and Wikipedia. The code works without any issues when I use GPT models.

When I use Gemma as the LLM and local:BAAI/bge-small-en-v1.5 as the embedding model in my Jupyter notebook, the model generates appropriate responses to different questions against each data source (SQL or Wikipedia) on its own, whether I go through the SQL query engine or the vector index query engine separately.
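For context, my setup roughly follows the standard LlamaIndex Settings pattern. The sketch below is only illustrative (the exact Gemma checkpoint and generation parameters are assumptions, not my actual notebook code):

```python
# Illustrative sketch only: model names and parameters are assumptions.
from llama_index.core import Settings
from llama_index.llms.huggingface import HuggingFaceLLM

# Gemma loaded locally through HuggingFaceLLM (checkpoint name is an example).
Settings.llm = HuggingFaceLLM(
    model_name="google/gemma-7b-it",
    tokenizer_name="google/gemma-7b-it",
    max_new_tokens=512,
    device_map="auto",
)

# Local embedding model, resolved by LlamaIndex from the "local:" prefix.
Settings.embed_model = "local:BAAI/bge-small-en-v1.5"
```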

However, when I use query_engine = SQLAutoVectorQueryEngine(sql_tool, vector_tool) to retrieve data from both sources and run response = query_engine.query("Tell me about the arts and culture of the city with the highest population"),
I get the following error:

=====================================================================
JSONDecodeError Traceback (most recent call last)
File c:\Users.conda\envs\llamaindex_py3.10\lib\site-packages\llama_index\core\output_parsers\selection.py:75, in SelectionOutputParser.parse(self, output)
74 try:
---> 75 json_obj = json.loads(json_string)
76 except json.JSONDecodeError as e_json:

File c:\Users.conda\envs\llamaindex_py3.10\lib\json\__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
343 if (cls is None and object_hook is None and
344 parse_int is None and parse_float is None and
345 parse_constant is None and object_pairs_hook is None and not kw):
--> 346 return _default_decoder.decode(s)
347 if cls is None:

File c:\Users.conda\envs\llamaindex_py3.10\lib\json\decoder.py:340, in JSONDecoder.decode(self, s, _w)
339 if end != len(s):
--> 340 raise JSONDecodeError("Extra data", s, end)
341 return obj

JSONDecodeError: Extra data: line 7 column 1 (char 210)

During the handling of the above exception, another exception occurred:

ScannerError Traceback (most recent call last)
File c:\Users.conda\envs\llamaindex_py3.10\lib\site-packages\llama_index\core\output_parsers\selection.py:84, in SelectionOutputParser.parse(self, output)
...
{
"choice": 2,
"reason": "The question is about the arts and culture of a city, so the most relevant choice is (2) Useful for answering semantic questions about different cities."
}
]
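
From what I can tell, the "Extra data" error means the selector's raw output contained more than one JSON value: the JSON array above plus additional text from Gemma, which json.loads rejects. A minimal standalone reproduction of that failure mode (the sample string below is made up):

```python
import json

# json.loads accepts exactly one JSON value; any non-whitespace text after it
# (model commentary, a second block, markdown fences) raises "Extra data",
# which is what SelectionOutputParser.parse hits here.
raw = """[
  {"choice": 2, "reason": "..."}
]
Some additional explanation from the model"""

try:
    json.loads(raw)
except json.JSONDecodeError as err:
    print(err)  # e.g. Extra data: line 4 column 1 (char ...)
```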

Google org

Hi @Alwiin, sorry for the late response. Could you please try using the output_parser parameter to specify an output parser that can handle both JSON and non-JSON output?
Try the code below:
import json
query_engine = SQLAutoVectorQueryEngine(sql_tool, vector_tool, output_parser=lambda output: json.loads(output["json"]))
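
If that hook doesn't fit your LlamaIndex version, another option is a small lenient parser that extracts only the first JSON value and ignores any surrounding text. This is just a sketch: lenient_json_parse is a made-up helper name, and you would still need to wire it into the selection step (for example by wrapping or subclassing SelectionOutputParser), which depends on your llama-index version:

```python
import json

def lenient_json_parse(output: str):
    """Extract the first JSON value from model output that may contain
    extra text (markdown fences, commentary) before or after it."""
    # Find where the JSON payload starts ('[' for a list of choices, '{' for an object).
    starts = [i for i in (output.find("["), output.find("{")) if i != -1]
    if not starts:
        raise ValueError(f"No JSON found in output: {output!r}")
    # raw_decode parses the first complete JSON value and ignores any trailing text.
    obj, _ = json.JSONDecoder().raw_decode(output[min(starts):])
    return obj

# Example with output shaped like the one in the traceback: trailing commentary is ignored.
print(lenient_json_parse('[\n  {"choice": 2, "reason": "..."}\n]\nSome extra text from Gemma'))
```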
