LWN.net's Journal
Friday, April 15th, 2016
3:10p: Friday's security advisories

Arch Linux has updated lhasa (code execution).
Debian has updated chromium-browser (multiple vulnerabilities).
Fedora has updated cryptopp (F24: information disclosure), libtasn1 (F24: denial of service), poppler (F23: code execution), qpid-proton (F23: TLS to plaintext downgrade), and samba (F24: multiple vulnerabilities).
openSUSE has updated java-1_7_0-openjdk (13.1: sandbox bypass).

5:22p:
Costa: Designing a Userspace Disk I/O Scheduler for Modern Datastores: the Scylla example (Part 1)

Over at the Scylla blog, Glauber Costa looks at why a high-performance datastore application might want to do its own I/O scheduling. "If one is using a threaded approach for managing I/O, a thread can be assigned to a different priority group by tools such as ionice. However, ionice only allows us to choose between general concepts like real-time, best-effort and idle. And while Linux will try to preserve fairness among the different actors, that doesn’t allow any fine tuning to take place. Dividing bandwidth among users is a common task in network processing, but it is usually not possible with disk I/O without resorting to infrastructure like cgroups.
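For readers unfamiliar with the coarse-grained classes Costa mentions, here is a brief sketch of how the util-linux ionice tool is used; the commands and paths wrapped by ionice below are purely illustrative.

```shell
# Idle class (3): the job only gets disk time when no other process
# wants it; the idle class takes no per-process priority level.
ionice -c 3 tar czf /tmp/backup.tar.gz /var/log

# Best-effort class (2) with the highest of its eight priority
# levels (0 is highest, 7 is lowest):
ionice -c 2 -n 0 grep -r pattern /srv/data

# Show the I/O scheduling class of an existing process by PID:
ionice -p 1
```

As the article notes, these three classes are the full extent of the tuning available, which is why finer-grained bandwidth division requires cgroups or, as Scylla does, scheduling in userspace.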
More importantly, modern designs like the Seastar framework used by Scylla to build its infrastructure may stay away from threads in favor of a thread-per-core design in the search for better scalability. In the light of these considerations, can a userspace application like Scylla somehow guarantee that all actors are served according to the priorities we would want them to obey?"

8:56p:
Brauch: Processing scientific data in Python and numpy, but doing it fast

On his blog, Sven Brauch has some suggestions on how to use NumPy to process scientific data and how to avoid some pitfalls that will ruin its performance. "In general, copying data is cheap. But if your program simulates 25 million particles, each having a float64 location in 3d, you already have 8*3*25e6 = 600 MB of data. Thus, if you write r = r + v*dt, you will copy 1.2 GB of data around in memory: once 600 MB to calculate v*dt, and again to calculate r+(v*dt), and only then the result is written back to r. This can really become a major bottleneck if you aren’t careful. Fortunately, it is usually easy to circumvent; instead of writing r = r+dv, write r += dv. Instead of a = 3*a + b, write a *= 3; a += b. This avoids the copying completely. For calculating v*dt and adding it to r, the situation is a bit more tricky; one good idea is to just have the unit of v be such that you don’t need to multiply by dt. If that is not possible, it might even be worth it to keep a copy of v which is multiplied by dt already, and update that whenever you update v. This is advantageous if only few v values change per step of your simulation.
I would not recommend writing it like this everywhere though, it’s often not worth the loss in readability; just for really large arrays and when the code is executed frequently."
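The pattern Brauch describes can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the blog post; the array size is shrunk from the 25 million particles in his example, and the reusable scratch buffer (here `buf`) is one way to avoid the temporary allocated for v*dt on every step.

```python
import numpy as np

n = 1000                        # blog example uses 25 million particles
rng = np.random.default_rng(0)
r = np.zeros((n, 3))            # particle positions, float64, shape (n, 3)
v = rng.random((n, 3))          # particle velocities
dt = 0.01

# Copying version: r + v*dt allocates one temporary for v*dt and
# another for the sum before the result is bound to the name.
r_copy = r + v * dt

# In-place version: write v*dt into a preallocated scratch buffer,
# then add it to r without allocating any new position array.
buf = np.empty_like(v)          # allocated once, reused every step
np.multiply(v, dt, out=buf)     # buf = v*dt, no fresh allocation
r += buf                        # updates r's existing memory in place

# Both approaches produce the same positions.
assert np.allclose(r, r_copy)
```

The `out=` argument accepted by NumPy ufuncs is what makes the second half allocation-free once `buf` exists; in a long-running simulation loop, that buffer would be created once outside the loop and reused on every timestep.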