Single master can typically handle a few dozen slaves
If you have a lot of slaves and want to apply filtering to all of them, consider using the BLACKHOLE engine to do the filtering in one place – especially if you run the blackhole slave on the same server, to limit network bandwidth.
Replicate over WAN – slave_net_timeout, master_connect_retry, slave_compressed_protocol
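A minimal sketch of tuning those settings on the slave (the values are placeholders, not recommendations from the talk); they can equally be set in my.cnf:

    -- Run on the slave (stop the slave first for CHANGE MASTER TO).
    STOP SLAVE;
    SET GLOBAL slave_net_timeout = 60;           -- give up and reconnect after 60 s of silence
    SET GLOBAL slave_compressed_protocol = 1;    -- compress replication traffic over the WAN
    CHANGE MASTER TO MASTER_CONNECT_RETRY = 30;  -- wait 30 s between reconnection attempts
    START SLAVE;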
UDFs allow you to run functions in a wide range of locations, i.e., on servers other than the database server.
Synch and asynch jobs
gman_sum() provides simple example of map-reduce with gearman
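A hedged sketch of calling the Gearman MySQL UDFs (function names follow the gearman-mysql-udf project; exact signatures may vary by version, and the worker name and table below are hypothetical):

    -- Point the UDFs at a Gearman job server (address is a placeholder).
    SELECT gman_servers_set('127.0.0.1:4730');

    -- Synchronous job: blocks until a worker returns the result.
    SELECT gman_do('reverse', 'Hello world');

    -- Map-reduce style: gman_sum() fans work out to 'wc' workers and
    -- aggregates the results on the way back.
    SELECT gman_sum('wc', url) AS total_word_count FROM urls;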
– Asynch queues, a.k.a. background tasks
e.g., email notifications (avoid blocking on sendmail), full text indexing, stat counters, etc.
Can enable you to do batch operations, when you don’t want to block on a large batch
Maybe use trigger to call a UDF that does a background task. Allows you to implement much more complex triggers with external side effects (e.g., sending notifications).
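A hedged sketch of that pattern, reusing the gearman-mysql-udf functions; the table, trigger, and worker names are hypothetical:

    DELIMITER //
    CREATE TRIGGER queue_notification
    AFTER INSERT ON comments
    FOR EACH ROW
    BEGIN
      -- Submit a background job and return immediately, instead of blocking
      -- the INSERT on something like sendmail.
      SET @job = gman_do_background('send_notification', NEW.id);
    END//
    DELIMITER ;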
Putting Gearman workers between app and MySQL can allow you to parallelize queries (scatter-gather).
Persistent queues – if the job server crashes, it can replay the queue on startup, or you can start another job server and point it at the queue.
Queues can be stored in Drizzle, MySQL, PostgreSQL, SQLite, and even in memcached.
UPDATE/DELETE by full scan does sequential reads and writes, while by index scan the I/O is random. So a full scan can be better in some cases.
In 5.1 you can disable next-key locking: use binlog-format=ROW and transaction-isolation=READ-COMMITTED. But READ COMMITTED can perform worse with high concurrency.
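A minimal sketch of those two settings (they can also go in my.cnf as binlog-format and transaction-isolation); keep the READ COMMITTED concurrency caveat above in mind:

    -- Requires SUPER; existing sessions keep their old values.
    SET GLOBAL binlog_format = 'ROW';
    SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;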
Deadlock caused by indexing
Sometimes single query update or delete has to update secondary index as well as PK table in InnoDB.
If one statement accesses and modifies the secondary index before accessing the table records, while another statement modifies the same table record and then needs to update the same row in the secondary index, you can get a deadlock.
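An illustrative, hedged sketch of that ordering problem (schema and statements are hypothetical, not from the talk):

    CREATE TABLE orders (
      id     INT PRIMARY KEY,
      status VARCHAR(16),
      KEY idx_status (status)
    ) ENGINE=InnoDB;

    -- Session 1: driven by the secondary index, so it locks idx_status
    -- entries first and then the clustered-index rows they point to.
    UPDATE orders SET status = 'shipped' WHERE status = 'paid';

    -- Session 2: driven by the primary key, so it locks the clustered-index
    -- row first and then must update the same idx_status entry.
    UPDATE orders SET status = 'cancelled' WHERE id = 42;

    -- If each session holds one of the two locks and waits for the other,
    -- InnoDB detects a deadlock and rolls one statement back.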
Covering index with range, limit, count
Range queries over large ranges are often slow, especially when the lookup goes through a secondary index before applying the range to the table records; this can result in many random reads. A covering index can improve performance by eliminating the random reads.
A covering index can also help a lot with LIMIT (especially when skipping many rows with a large offset) if the query would otherwise have to randomly access table records to apply other WHERE-clause conditions.
Similarly, covering index can help with COUNT, since you don’t have to randomly access table records.
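A hedged sketch that ties the three cases together; the table, index, and columns are hypothetical, and EXPLAIN should show "Using index" when the covering index is used:

    CREATE TABLE messages (
      id         INT AUTO_INCREMENT PRIMARY KEY,
      user_id    INT,
      created_at DATETIME,
      is_read    TINYINT,
      body       TEXT,
      -- Covering index: the queries below can be answered from the index
      -- alone, avoiding random reads of the table records.
      KEY idx_user_created_read (user_id, created_at, is_read)
    ) ENGINE=InnoDB;

    -- Range over a large interval: no row lookups needed.
    SELECT id FROM messages
    WHERE user_id = 7 AND created_at > '2009-01-01';

    -- LIMIT with a large offset: the skipped rows are only touched in the index.
    SELECT id FROM messages
    WHERE user_id = 7 AND is_read = 0
    ORDER BY created_at
    LIMIT 10 OFFSET 10000;

    -- COUNT satisfied entirely from the index.
    SELECT COUNT(*) FROM messages
    WHERE user_id = 7 AND created_at > '2009-01-01';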
Covering index with blob
Assume you have a table with a BLOB that is rarely selected (e.g., < 10% of the time). To keep the BLOB from being loaded into the buffer pool, the typical optimization is to move it into another table with a 1:1 relationship, though this breaks normalization.
In his tests, performance with two tables is better until the percentage of queries requesting the BLOB (and thus requiring two SELECTs) gets up to around 45%.
Instead, you can use a covering index that includes all the columns most queries need (except the BLOB, of course). To get good performance, the initial column should be in chronological order, e.g., an auto-increment ID or a timestamp.
May have to use FORCE INDEX to get optimizer to use your index.
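A hedged sketch of the single-table alternative; names are hypothetical, and FORCE INDEX may or may not be needed depending on what the optimizer picks:

    CREATE TABLE posts (
      id           INT AUTO_INCREMENT PRIMARY KEY,
      author_id    INT,
      title        VARCHAR(200),
      published_at DATETIME,
      body         MEDIUMBLOB,
      -- Leading column is chronological; most queries are covered by this
      -- index and never touch the BLOB pages.
      KEY idx_cover (published_at, author_id, title)
    ) ENGINE=InnoDB;

    -- The common query never reads the BLOB, so its pages stay out of the
    -- buffer pool.
    SELECT id, author_id, title
    FROM posts FORCE INDEX (idx_cover)
    WHERE published_at > '2009-04-01'
    ORDER BY published_at DESC
    LIMIT 20;

    -- The rare (< 10%) query fetches the BLOB by primary key.
    SELECT body FROM posts WHERE id = 42;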
Sorting, indexing and query execution plans
If a query does an ORDER BY on a secondary index that is already used for the query, it can skip the filesort, which is O(n log n).
If not, the optimizer chooses between a full scan that uses the index for the sort versus an index scan (to filter) followed by a filesort.
Index merge can be used to filter records, but not to sort.
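A hedged illustration of checking this with EXPLAIN (hypothetical table; look at the Extra column for "Using filesort"):

    CREATE TABLE events (
      id         INT AUTO_INCREMENT PRIMARY KEY,
      user_id    INT,
      created_at DATETIME,
      KEY idx_user_created (user_id, created_at)
    ) ENGINE=InnoDB;

    -- Filter and sort satisfied by the same index: no filesort.
    EXPLAIN SELECT id FROM events
    WHERE user_id = 7
    ORDER BY created_at;

    -- Sort column not in index order for this filter: the optimizer must
    -- either scan the PK in id order (full scan, no filesort) or use
    -- idx_user_created to filter and then do a filesort.
    EXPLAIN SELECT id, created_at FROM events
    WHERE user_id = 7
    ORDER BY id DESC;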
Understanding the Role of IO as Bottleneck – slides
7200 RPM drive can generally do about 100 IOPS.
HDD disk seek about 10 ms. Reading 1 MB sequentially from HDD is about 20 ms.
When you do a read on InnoDB, the pages containing the rows are loaded into the buffer pool.
Writes to InnoDB go to buffer pool, but also are flushed to log file. Sequential IO to log file is fast. Buffer pool pages (16 KB/page) are marked as dirty until flushed to tablespace.
Writes to tablespace are sorted by innodb to be sequential as much as possible.
iostat is more useful than vmstat for IO analysis
If svctm large (maybe more than a few milliseconds), may have large competing IO requests, e.g., a busy shared SAN.
Queue size tells you how much data could be lost if the server crashes. Wait time indicates how long operations may be blocking, and also how much risk there is of lost data from a power loss with no battery backup (or a non-catastrophic disk crash).
Indexes and archiving old data can reduce your working set. If your server is read-heavy, lots of RAM can really help.
RAID 5 good for reads, but worse for writes. RAID 10 good for both.
A RAID controller with BBWC (battery-backed write cache) can reorder some random writes into sequential writes without the risk of a power loss losing data.
Fusion-IO card can theoretically do up to 100,000 IOPS. Another presenter suggested that InnoDB is currently limited to around 15,000 on these cards.
Often cheaper to buy more memory, though that doesn’t help as much with writes.
Write options – mount the file system with noatime, compress blobs, reduce the number of filesorts that go to disk.
If RAID with no BBWC, use innodb_flush_log_at_trx_commit of 2 (but be aware of possible data loss) or group statements into short transactions.
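A minimal sketch of the trade-off (the table in the second part is hypothetical); the setting can also live in my.cnf:

    -- Flush the log to the OS on commit but fsync only about once per second;
    -- roughly the last second of commits can be lost on a power failure.
    SET GLOBAL innodb_flush_log_at_trx_commit = 2;

    -- Alternative that keeps full durability: amortize the fsync by grouping
    -- statements into short transactions.
    START TRANSACTION;
    INSERT INTO stats (counter) VALUES (1), (2), (3);
    COMMIT;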
Referred to Facebook talk – they said maximum desirable transaction length for their workload and servers is about 4 seconds.
Safer to purge binary logs by using PURGE BINARY LOGS {TO, BEFORE}, since it updates the index file.
If you use expire_logs_days, removal happens at server restart or at a log flush (which happens during a file rotation).
RESET MASTER will delete everything and create new binlog file. Disconnect any slaves before doing this.
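A hedged sketch of the safer cleanup commands; the file name and date are placeholders:

    -- Remove logs up to (not including) the named file, updating the index file.
    PURGE BINARY LOGS TO 'mysql-bin.000123';

    -- Or remove everything older than a given datetime.
    PURGE BINARY LOGS BEFORE '2009-04-01 00:00:00';

    -- Automatic expiration; note the removal timing caveat above.
    SET GLOBAL expire_logs_days = 7;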
Common header about 19 bytes. 4 byte timestamp, type, server id, length, file position, flags. File position is the end of the event.
SYSDATE() is dangerous, because it actually runs on the slave, so the generated timestamp will be different than on the master.
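A small hedged illustration (the table is hypothetical); starting the server with --sysdate-is-now, which makes SYSDATE() an alias for NOW(), is one common mitigation:

    -- Unsafe with statement-based replication: SYSDATE() is re-evaluated when
    -- the statement runs on the slave, so the stored timestamps can differ.
    INSERT INTO audit_log (event, created_at) VALUES ('login', SYSDATE());

    -- NOW() uses the statement's timestamp, which is replicated with the
    -- event, so master and slave store the same value.
    INSERT INTO audit_log (event, created_at) VALUES ('login', NOW());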
Context events (user-defined variables, the rand seed, auto_increment, last_insert_id()) are grouped with the query in the binary log.
Since replication user can read binary logs, it can read all changes. Can be security issue.
For setting passwords, it is better to put the hash into a user variable and then set that hash directly as the password; only the already-hashed value in the SET statement is replicated.
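A hedged sketch of that approach using the pre-5.6-era SET PASSWORD syntax; the account and hash shown are placeholders:

    -- Compute the hash where it won't be logged (SELECTs are not binlogged),
    -- or compute it client-side.
    SELECT PASSWORD('s3cret');   -- returns a hash like '*2AC9...'

    -- Only this statement, containing the already-hashed value, is written
    -- to the binary log; the cleartext password never appears there.
    SET PASSWORD FOR 'app'@'%' = '*2AC9...';   -- paste the real hash here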
Statements in transaction go into a cache. It is flushed to binary log on commit. It is emptied if a transaction with only transactional statements is rolled back. Complicated if you mix in MyISAM changes.
Transaction cache spills to a file if it gets too big. This is one of the reasons large transactions can cause bad performance.
DDL statements like CREATE and ALTER cause implicit commits. Changes to the mysql database and LOAD DATA INFILE are also implicitly committed.