Programming Snapshot: Implementing Fast Queries for Local Files in Go - xSicKxBot - 10-03-2018 <div style="margin: 5px 5% 10px 5%;"><img src="http://www.sickgaming.net/blog/wp-content/uploads/2018/10/programming-snapshot-implementing-fast-queries-for-local-files-in-go.png" width="346" height="92" title="" alt="" /></div><div><div class="article_intro"> <p>To find files quickly in the deeply nested subdirectories of his home directory, Mike whips up a Go program to index file metadata in an SQLite database.</p> </div> <div class="article_body"> <p>…the GitHub Codesearch <a class="info" href="http://www.linuxpromagazine.com/Online/Features/Programming-Snapshot-Go/(offset)/3#article_i1">[1]</a> project, with its indexer built in Go, at least lets you browse locally available repositories, index them, and then search for code snippets in a flash. Its author, Russ Cox, then an intern at Google, later explained how the search works <a class="info" href="http://www.linuxpromagazine.com/Online/Features/Programming-Snapshot-Go/(offset)/3#article_i2">[2]</a>.</p> </div> <p>How about using a similar method to create an index of the files below a start directory and then run quick queries such as: “Which files have recently been modified?” “Which are the biggest wasters of space?” Or “Which file names match the following pattern?”</p> <p>Unix filesystems store metadata in inodes, which reside in flat structures on disk that cause database-style queries to run at a snail’s pace. 
To inspect a file’s metadata, run the <code>stat</code> command on it and check the file size and timestamps, such as the time of the last modification (<a class="figure" href="http://www.linuxpromagazine.com/Online/Features/Programming-Snapshot-Go#article_f2">Figure 2</a>).</p> <div class="object-center"> <div class="imagecenter"><img alt="" src="http://www.sickgaming.net/blog/wp-content/uploads/2018/10/programming-snapshot-implementing-fast-queries-for-local-files-in-go.png" /> <p>Figure 2: Inode metadata of a file, here determined by stat, can be used to build an index.</p> </div> </div> <p>Newer filesystems like ZFS or Btrfs organize the files they contain in a more database-like way, but they do not go far enough to support meaningful queries from userspace.</p> <h4>Fast Forward Instead of Pause</h4> <p>For example, if you want to find all files over 100MB on the disk, you can do this with a <code>find</code> call like:</p> <pre class="auto"> find / -type f -size +100M</pre> <p>If you are running the search on a traditional hard disk, take a coffee break. Even on a fast SSD, expect search times in the range of minutes. The reason is that the metadata is scattered across the sectors of the disk in a query-unfriendly way.</p> <p>Read more at <a href="http://www.linuxpromagazine.com/Online/Features/Programming-Snapshot-Go">Linux Pro Magazine</a></p> </div> |