I've never corrupted an SQLite database _file_, but an FTS5 index is remarkably easy to mess up if you don't read and understand the docs to the letter and which parts of them apply to your exact use case. When you do a thing wrong with FTS5, you either get a generic ambiguous error, or silent corruption of the index. It's not a black-box extension to you can just enable with no understanding of how it works under the hood, I learned that the hard way.
nok22kon 12 hours ago [-]
> Multiple copies of SQLite linked into the same application
I had recent SQLite corruption, and I suspect it was this - I was accessing an SQLite database from the same python process using both the builtin sqlite3 package, and also the third party apsw library
webprofusion 12 hours ago [-]
Interesting that it doesn't specifically call out Anti-Virus scanning (which does occasionally result in at least one of these scenarios). I've seen many SQLite database become corrupted and the best you can do is have a backup.
rogerbinns 2 hours ago [-]
SQLite has code to work around anti-virus scanning, by doing retries while the anti-virus tool is processing an operation. I've not seen anti-virus tools result in corruption on the SQLite forum.
BiteCode_dev 7 hours ago [-]
I had one machine on my entire client's network that could never download a LibreOffice document without it being corrupted.
Turns out it was kaspersky intercepting network calls, and deciding it was a very dangerous piece of file, and it would truncate it completely silently.
After wasting a non-billable afternoon on it, I just disabled the antivirus out of desperation and figured it out.
The solution was to generate a self signed certificate and TLS the connection and prevent the bugger to MITM us.
Since this day, even on a local network with behind a proxies and using a VPN, I still use https for all the services if I'm allowed.
wolfi1 10 hours ago [-]
if processes lock the file shouldn't AV refrain from reading or even writing it?
rcxdude 8 hours ago [-]
Nope, AV hooks into the filesystem layer (the NT kernel has 'filesystem filters' for this) and intercepts all reads and writes on the system.
felix-the-cat 4 hours ago [-]
I just finished building an open-source project using SQLite, and this makes me a little nervous. Is SQLite generally considered a reliable system? I've only had one experience with the DB becoming corrupted and that was because I used the Task Manager to force kill the process while it was running under a heavy load, and I felt that was understandable.
mattrighetti 3 hours ago [-]
> Is SQLite generally considered a reliable system?
It is arguably the most battle-tested piece of software on the planet. It's deployed basically on every single piece of tech out there.
jurip 3 hours ago [-]
And in many cases, multiple times. I wouldn't be surprised if the number of SQLite DBs on my phone was in the hundreds and the number of separately compiled copies of the library is in double digits.
felix-the-cat 3 hours ago [-]
Thanks, appreciate it!
listeria 3 hours ago [-]
killing a process which is using SQLite shouldn't corrupt the database, same as pulling the plug.
rnio 4 hours ago [-]
[flagged]
seethishat 5 hours ago [-]
To make the corruption more efficient, store the file on a RAID.
earth_tattoo 4 hours ago [-]
Why do people shit on raid? I have been running RAID servers for over a decade with no issues, including a personal server with 5x16tb drives.
Spooky23 4 hours ago [-]
Usually because people think it’s a backup.
seethishat 4 hours ago [-]
It's a joke. By using RAID, more bits will be corrupted, thus more efficient.
toxik 6 hours ago [-]
> One example of this occurred circa 2013-08-30 on the canonical repository for the Fossil DVCS. In that event, file descriptor 2 (standard error) was being erroneously closed (by stunnel, we suspect) prior to sqlite3_open_v2() so that the file descriptor used for the repository database file was 2. Later, an application bug caused an assert() statement to emit an error message by invoking write(2,...). But since file descriptor 2 was now connected to a database file, the error message overwrote part of the database.
Kind of crazy that this is an issue in modern operating systems. There are just so many ways to avoid this obvious footgun of an API design. stdin/out/err should be reserved file descriptors. In fact, why reuse file descriptors at all? Just count up.
> The open() function shall return a file descriptor for the named file that is the lowest file descriptor not currently open for that process.
duped 3 hours ago [-]
Firstly, standards compliance is bad when the standard is bad.
Secondly nothing precludes defining '3' as the lowest file descriptor for open(), since stdin/stdout/stderr cannot be opened (they are opened before the process starts).
nb4 open("/dev/stdin"), that's not what I mean
11 hours ago [-]
linzhangrun 4 days ago [-]
Interesting title for official SQLite documentation :)
unfocso 3 days ago [-]
The whole sqlite documentation is full of gold gems and other curious documents mostly to appease bureocrats and big companies. It doubles as a fun read other than being incredibly useful.
It’s impressive. To admit fallibility is to be honest. It represents confidence.
search_facility 9 hours ago [-]
Also represents good test coverage
owenmarshall 2 hours ago [-]
It's almost a HN meme at this point (and you might be referencing it ;-)). But for anyone unaware, SQLite's test coverage is _outstanding_ and well worth reading about[0].
So is the official pronunciation of SQLite spelling out the letters then? I’d expect “a” not “an”…
cyberax 12 hours ago [-]
I have never heard it pronounced in any other way than "s-q-lite".
lgas 3 hours ago [-]
Odd. I've never heard it pronounced in any other way than "Sequel-lite".
10 hours ago [-]
wat10000 5 hours ago [-]
Officially it’s S-Q-L-ite, as if it were a mineral.
pidgeon_lover 4 hours ago [-]
Not Sequel-lite?
sigzero 1 hours ago [-]
Nope. A quote from Dr. Hipp:
"How do I pronounce the name of the product? I say S-Q-L-ite, like a mineral."
gumby271 31 minutes ago [-]
This feels like the most ridiculous option and one I've never heard in the wild. Sequel-ite, S-Q-lite, sure. S-Q-L-ite, nah I'm not buying that. Kinda like the inventor of GIF insisting on pronouncing it wrong (I kid)
wat10000 3 minutes ago [-]
He's also said he doesn't care how other people pronounce it, so thankfully you're safe.
red1oon 4 days ago [-]
Article date is Jan 2022.
This changes when SQLite runs as WASM in a browser — a context that only became properly viable with OPFS synchronous access handles in mid-2022.
andrewl 3 days ago [-]
The January 6, 2022 date at the bottom of the page is not the date the page was last updated. It is the date problem 8.9 (Boundary value error in the secondary journals used by nested transactions) directly above it was fixed. The date at the very bottom of the screen in the middle says the page itself was last updated on 2026-04-13.
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=41846796 - Oct 2024 (1 comment)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=33503555 - Nov 2022 (1 comment)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=31214131 - April 2022 (139 comments)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=16579986 - March 2018 (10 comments)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=6502229 - Oct 2013 (63 comments)
I had recent SQLite corruption, and I suspect it was this - I was accessing an SQLite database from the same python process using both the builtin sqlite3 package, and also the third party apsw library
Turns out it was kaspersky intercepting network calls, and deciding it was a very dangerous piece of file, and it would truncate it completely silently.
After wasting a non-billable afternoon on it, I just disabled the antivirus out of desperation and figured it out.
The solution was to generate a self signed certificate and TLS the connection and prevent the bugger to MITM us.
Since this day, even on a local network with behind a proxies and using a VPN, I still use https for all the services if I'm allowed.
It is arguably the most battle-tested piece of software on the planet. It's deployed basically on every single piece of tech out there.
Kind of crazy that this is an issue in modern operating systems. There are just so many ways to avoid this obvious footgun of an API design. stdin/out/err should be reserved file descriptors. In fact, why reuse file descriptors at all? Just count up.
> https://pubs.opengroup.org/onlinepubs/000095399/functions/op...
> The open() function shall return a file descriptor for the named file that is the lowest file descriptor not currently open for that process.
Secondly nothing precludes defining '3' as the lowest file descriptor for open(), since stdin/stdout/stderr cannot be opened (they are opened before the process starts).
nb4 open("/dev/stdin"), that's not what I mean
See, for example: "Defense about the dark arts" (https://sqlite.org/security.Html) and "Why in C?" saying "Because C is best."
[0] https://www.sqlite.org/testing.html
"How do I pronounce the name of the product? I say S-Q-L-ite, like a mineral."