Skip to content

bpo-38185: Fixed case-insensitive string comparison in sqlite3.Row indexing. #16190

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 18 additions & 7 deletions Lib/sqlite3/test/factory.py
Original file line number Diff line number Diff line change
Expand Up @@ -98,16 +98,14 @@ def CheckCustomFactory(self):

def CheckSqliteRowIndex(self):
self.con.row_factory = sqlite.Row
row = self.con.execute("select 1 as a, 2 as b").fetchone()
row = self.con.execute("select 1 as a_1, 2 as b").fetchone()
self.assertIsInstance(row, sqlite.Row)

col1, col2 = row["a"], row["b"]
self.assertEqual(col1, 1, "by name: wrong result for column 'a'")
self.assertEqual(col2, 2, "by name: wrong result for column 'a'")
self.assertEqual(row["a_1"], 1, "by name: wrong result for column 'a_1'")
self.assertEqual(row["b"], 2, "by name: wrong result for column 'b'")

col1, col2 = row["A"], row["B"]
self.assertEqual(col1, 1, "by name: wrong result for column 'A'")
self.assertEqual(col2, 2, "by name: wrong result for column 'B'")
self.assertEqual(row["A_1"], 1, "by name: wrong result for column 'A_1'")
self.assertEqual(row["B"], 2, "by name: wrong result for column 'B'")

self.assertEqual(row[0], 1, "by index: wrong result for column 0")
self.assertEqual(row[1], 2, "by index: wrong result for column 1")
Expand All @@ -116,13 +114,26 @@ def CheckSqliteRowIndex(self):

with self.assertRaises(IndexError):
row['c']
with self.assertRaises(IndexError):
row['a_\x11']
with self.assertRaises(IndexError):
row['a\x7f1']
with self.assertRaises(IndexError):
row[2]
with self.assertRaises(IndexError):
row[-3]
with self.assertRaises(IndexError):
row[2**1000]

def CheckSqliteRowIndexUnicode(self):
self.con.row_factory = sqlite.Row
row = self.con.execute("select 1 as \xff").fetchone()
self.assertEqual(row["\xff"], 1)
with self.assertRaises(IndexError):
row['\u0178']
with self.assertRaises(IndexError):
row['\xdf']

def CheckSqliteRowSlice(self):
# A sqlite.Row can be sliced like a list.
self.con.row_factory = sqlite.Row
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fixed case-insensitive string comparison in :class:`sqlite3.Row` indexing.
62 changes: 31 additions & 31 deletions Modules/_sqlite/row.c
Original file line number Diff line number Diff line change
Expand Up @@ -76,16 +76,38 @@ PyObject* pysqlite_row_item(pysqlite_Row* self, Py_ssize_t idx)
return item;
}

static int
equal_ignore_case(PyObject *left, PyObject *right)
{
int eq = PyObject_RichCompareBool(left, right, Py_EQ);
if (eq) { /* equal or error */
return eq;
}
if (!PyUnicode_Check(left) || !PyUnicode_Check(right)) {
return 0;
}
if (!PyUnicode_IS_ASCII(left) || !PyUnicode_IS_ASCII(right)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, looks like sqlite supports non-ascii characters as identifiers: https://github.com/mackyle/sqlite/blob/a419afd73a544e30df878db55f7faa17790c01bd/ext/misc/normalize.c#L174-L177

And as shown in your example on bpo, should we maybe just simplify this code and do a casefold comparison?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial version used casefold. But then I made some experiments and found that Sqlite itself ignores case only for ASCII letters. It was just my guess, and thank you for confirming it with the reference to sources.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aah okay, I thought the case-insensitivity for the Row object was for the convenience of the Python API, not trying to mirror any particular part of sqlite itself. Which part of sqlite ignores case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which part of sqlite ignores case?

I guess any part which compare identifiers. For example, the following code passes:

create table test(a text)
select A from test

and the following fails:

create table test(a text, A text)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aah gotcha, thanks for taking the time to explain Serhiy.

return 0;
}

Py_ssize_t len = PyUnicode_GET_LENGTH(left);
if (PyUnicode_GET_LENGTH(right) != len) {
return 0;
}
const Py_UCS1 *p1 = PyUnicode_1BYTE_DATA(left);
const Py_UCS1 *p2 = PyUnicode_1BYTE_DATA(right);
for (; len; len--, p1++, p2++) {
if (Py_TOLOWER(*p1) != Py_TOLOWER(*p2)) {
return 0;
}
}
return 1;
}

PyObject* pysqlite_row_subscript(pysqlite_Row* self, PyObject* idx)
{
Py_ssize_t _idx;
const char *key;
Py_ssize_t nitems, i;
const char *compare_key;

const char *p1;
const char *p2;

PyObject* item;

if (PyLong_Check(idx)) {
Expand All @@ -98,44 +120,22 @@ PyObject* pysqlite_row_subscript(pysqlite_Row* self, PyObject* idx)
Py_XINCREF(item);
return item;
} else if (PyUnicode_Check(idx)) {
key = PyUnicode_AsUTF8(idx);
if (key == NULL)
return NULL;

nitems = PyTuple_Size(self->description);

for (i = 0; i < nitems; i++) {
PyObject *obj;
obj = PyTuple_GET_ITEM(self->description, i);
obj = PyTuple_GET_ITEM(obj, 0);
compare_key = PyUnicode_AsUTF8(obj);
if (!compare_key) {
int eq = equal_ignore_case(idx, obj);
if (eq < 0) {
return NULL;
}

p1 = key;
p2 = compare_key;

while (1) {
if ((*p1 == (char)0) || (*p2 == (char)0)) {
break;
}

if ((*p1 | 0x20) != (*p2 | 0x20)) {
break;
}

p1++;
p2++;
}

if ((*p1 == (char)0) && (*p2 == (char)0)) {
if (eq) {
/* found item */
item = PyTuple_GetItem(self->data, i);
Py_INCREF(item);
return item;
}

}

PyErr_SetString(PyExc_IndexError, "No item with that key");
Expand Down