Re: Very long deletion time on a 200 GB database

Hi, everyone.

So it turns out that we're not using 25 GB of virtual memory. (That's what I had been shown yesterday, and it was a bit surprising, to say the least...)

A few statistics that I managed to get from the Windows developers/system administrators:

- The machine has a total of 3.5 GB of RAM
- shared_buffers was set to 256 MB (yes, MB!)
- Virtual memory usage by our process is 3 MB (yes, MB)
- The CPU is virtually idle while the deletes run, at about 1% utilization
- No other processes are accessing the database when we're running the maintenance; there are a total of three server processes, but two are idle.

(I was a bit surprised, to say the least, by the low shared_buffers setting, given that increasing it was one of the first things I told them to do about 18 months ago.)
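In case anyone else needs to verify the same thing on a machine they don't directly control, the running value is easy to check from any session; raising it, on the other hand, means editing postgresql.conf and restarting the server:

    -- shows the value the server is actually using
    SHOW shared_buffers;

    -- in postgresql.conf (takes effect only after a restart):
    -- shared_buffers = 1GB    -- up from 256MB; a common rule of thumb
    --                         -- is ~25% of RAM on a dedicated box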

As for Tom's point about rules, I know that rules are bad, and I'm not sure why the system is using a rule rather than a trigger. I'll see if I can change that to a trigger, but I have very indirect control over the machines, and every change requires (believe it or not) writing a .NET program that runs my changes, rather than just a textual script that deploys them.
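For what it's worth, here's the shape of the change I have in mind. The table, column, and function names are all made up for illustration, since I haven't yet seen the actual rule definition:

    -- hypothetical: the existing rule, which rewrites every DELETE on b
    -- CREATE RULE b_audit_on_delete AS ON DELETE TO b
    --     DO ALSO INSERT INTO b_audit VALUES (OLD.id, now());

    -- the trigger equivalent, which simply fires once per deleted row:
    CREATE FUNCTION b_audit_on_delete() RETURNS trigger AS $$
    BEGIN
        INSERT INTO b_audit VALUES (OLD.id, now());
        RETURN OLD;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER b_audit_on_delete
        BEFORE DELETE ON b
        FOR EACH ROW EXECUTE PROCEDURE b_audit_on_delete();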

The only foreign keys are from the B table (i.e., the table whose records I want to remove) to other tables. There are no REFERENCES pointing to the B table. That said, I hadn't realized that primary keys and indexes can also delay the DELETE.
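Since every index on B has to be maintained for each deleted row, the first thing I'll do is count how many it actually carries. A quick way to see what's there (assuming the table is named b and lives in the public schema):

    -- list every index that must be updated when rows leave b
    SELECT indexname, indexdef
      FROM pg_indexes
     WHERE schemaname = 'public'
       AND tablename = 'b';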

For the latest round of testing, I quadrupled shared_buffers to 1 GB, turned off hash joins for the session (as someone here suggested), and simplified the query (based on everyone's suggestions). In tests on my own computer, using a somewhat random 1 GB snapshot of the 200 GB database, the simplified query was indeed much faster, so I'm optimistic.
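Concretely, the test session looked roughly like this. (The WHERE clause is a made-up stand-in for the real, simplified condition.)

    -- session-local; doesn't touch postgresql.conf
    SET enable_hashjoin = off;

    -- check the plan before running the delete for real
    EXPLAIN
    DELETE FROM b
     WHERE created_at < '2011-01-01';   -- hypothetical criterion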

Several people suggested that chunking the deletes might indeed help, which makes me feel a bit better. Unfortunately, given how long the queries take to run, it's hard to experiment my way to the right chunk size. Whoever suggested doing it in time slices had an interesting idea, but I'm not sure it's implementable given our constraints. A sketch of the chunked approach is below.
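If we do go this route, the plan is the usual loop over a keyed subquery. Again, the names and the chunk size here are made up; the real criteria would replace the date test:

    -- delete in batches so each transaction stays short
    DELETE FROM b
     WHERE id IN (SELECT id
                    FROM b
                   WHERE created_at < '2011-01-01'   -- hypothetical criterion
                   LIMIT 10000);

    -- repeat (e.g., from a driver script) until the DELETE reports 0 rows,
    -- possibly with a VACUUM every so often to keep bloat in check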

Thanks again to everyone for your help.  I'll let you know what happens...

Reuven


