{"id":51988,"date":"2018-10-02T09:25:00","date_gmt":"2018-10-02T09:25:00","guid":{"rendered":"http:\/\/www.sickgaming.net\/blog\/2018\/10\/02\/programming-snapshot-implementing-fast-queries-for-local-files-in-go\/"},"modified":"2018-10-02T09:25:00","modified_gmt":"2018-10-02T09:25:00","slug":"programming-snapshot-implementing-fast-queries-for-local-files-in-go","status":"publish","type":"post","link":"https:\/\/sickgaming.net\/blog\/2018\/10\/02\/programming-snapshot-implementing-fast-queries-for-local-files-in-go\/","title":{"rendered":"Programming Snapshot: Implementing Fast Queries for Local Files in Go"},"content":{"rendered":"<div class=\"article_intro\">\n<p>To find files quickly in the deeply nested subdirectories of his home directory, Mike whips up a Go program to index file metadata in an SQLite database.<\/p>\n<\/div>\n<div class=\"article_body\">\n<p>&#8230;the GitHub Codesearch\u00a0<a class=\"info\" href=\"http:\/\/www.linuxpromagazine.com\/Online\/Features\/Programming-Snapshot-Go\/(offset)\/3#article_i1\">[1]<\/a>\u00a0project, with its indexer built in Go, at least lets you browse locally available repositories, index them, and then search for code snippets in a flash. Its author, Russ Cox, then an intern at Google, explained later how the search works\u00a0<a class=\"info\" href=\"http:\/\/www.linuxpromagazine.com\/Online\/Features\/Programming-Snapshot-Go\/(offset)\/3#article_i2\">[2]<\/a>.<\/p>\n<\/div>\n<p>How about using a similar method to create an index of files below a start directory to perform quick queries such as: &#8220;Which files have recently been modified?&#8221; &#8220;Which are the biggest wasters of space?&#8221; Or &#8220;Which file names match the following pattern?&#8221;<\/p>\n<p>Unix filesystems store metadata in inodes, which reside in flattened structures on disk that cause database-style queries to run at a snail&#8217;s pace. 
To inspect a file&#8217;s metadata, run the\u00a0<code>stat<\/code> command on it and check the file size and timestamps, such as the time of the last modification (<a class=\"figure\" href=\"http:\/\/www.linuxpromagazine.com\/Online\/Features\/Programming-Snapshot-Go#article_f2\">Figure 2<\/a>).<\/p>\n<div class=\"object-center\">\n<div class=\"imagecenter\"><img decoding=\"async\" alt=\"\" src=\"http:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2018\/10\/programming-snapshot-implementing-fast-queries-for-local-files-in-go.png\" \/><\/p>\n<p>Figure 2: Inode metadata of a file, here determined by stat, can be used to build an index.<\/p>\n<\/div>\n<\/div>\n<p>Newer filesystems like ZFS or Btrfs organize their files in a more database-like way but do not go far enough to support meaningful queries from userspace.<\/p>\n<h4>Fast Forward Instead of Pause<\/h4>\n<p>For example, if you want to find all files over 100MB on the disk, you can do this with a\u00a0<code>find<\/code>\u00a0call like:<\/p>\n<pre class=\"auto\">\nfind \/ -type f -size +100M<\/pre>\n<p>If you are running the search on a traditional hard disk, take a coffee break. Even on a fast SSD, expect the search to take minutes. The reason is that the data is scattered in a query-unfriendly way across the sectors of the disk.<\/p>\n<p>Read more at <a href=\"http:\/\/www.linuxpromagazine.com\/Online\/Features\/Programming-Snapshot-Go\">Linux Pro Magazine<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>To find files quickly in the deeply nested subdirectories of his home directory, Mike whips up a Go program to index file metadata in an SQLite database. &#8230;the GitHub Codesearch\u00a0[1]\u00a0project, with its indexer built in Go, at least lets you browse locally available repositories, index them, and then search for code snippets in a flash. 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":51989,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[40],"tags":[],"class_list":["post-51988","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-linux-freebsd-unix"],"_links":{"self":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/51988","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/comments?post=51988"}],"version-history":[{"count":0,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/51988\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/media\/51989"}],"wp:attachment":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/media?parent=51988"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/categories?post=51988"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/tags?post=51988"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}