{"id":123523,"date":"2022-04-01T08:00:00","date_gmt":"2022-04-01T08:00:00","guid":{"rendered":"https:\/\/fedoramagazine.org\/?p=36107"},"modified":"2022-04-01T08:00:00","modified_gmt":"2022-04-01T08:00:00","slug":"using-sourcegraph-to-search-34000-fedora-repositories","status":"publish","type":"post","link":"https:\/\/sickgaming.net\/blog\/2022\/04\/01\/using-sourcegraph-to-search-34000-fedora-repositories\/","title":{"rendered":"Using Sourcegraph to Search 34,000+ Fedora Repositories"},"content":{"rendered":"<p>In October 2021, a Fedora Linux user <a href=\"https:\/\/lists.fedoraproject.org\/archives\/list\/legal@lists.fedoraproject.org\/thread\/CBCJHOSP36YXQKCVGWVL5MXU64LZ6NZA\/\" target=\"_blank\" rel=\"noreferrer noopener\">asked a question about licensing<\/a>. Fedora Project Leader Matthew Miller <a href=\"https:\/\/lists.fedoraproject.org\/archives\/list\/legal@lists.fedoraproject.org\/message\/LTIQS2PX33FSCEIAPJS62UZXVPDT5JPB\/\" target=\"_blank\" rel=\"noreferrer noopener\">left a response<\/a>: \u201cSince we don&#8217;t have a complete, exploded, searchable repository of all of the packages in Fedora, I don&#8217;t have a quick way to check.\u201d&nbsp;<\/p>\n<p><a href=\"https:\/\/lists.fedoraproject.org\/archives\/list\/legal@lists.fedoraproject.org\/message\/5GEPBSRGUK5E2FLW4MQBVP6DI65XP2LQ\/\" target=\"_blank\" rel=\"noreferrer noopener\">Followed by<\/a>: \u201c&#8230;or possibly pay Sourcegraph to do it for us. They seem like nice people.\u201d He is correct, we (Sourcegraph) <em>are<\/em> nice people, but we don\u2019t want your money. 
Instead, we wanted to team up with the Fedora community.<\/p>\n<p>The Fedora Community can now search their universe of open source code\u2014currently over 34,000 repositories and counting.<\/p>\n<p> <span id=\"more-36107\"><\/span> <\/p>\n<h2>Introduction to code search<\/h2>\n<p>For those who aren\u2019t familiar with the concept of <a href=\"https:\/\/codesearchguide.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">code search<\/a>, it helps teams onboard to a new codebase, find answers faster, and identify security risks, among many other use cases. Sourcegraph has indexed over two million repositories across multiple code hosts such as GitHub and GitLab. <strong>This article is going to focus strictly on code search for <em>src.fedoraproject.org<\/em>. <\/strong>Sourcegraph provides both a <a href=\"https:\/\/sourcegraph.com\/search?q=context:global+repo:%5Esrc.fedoraproject.org\/&amp;patternType=regexp\" target=\"_blank\" rel=\"noreferrer noopener\">web app<\/a> and a <a href=\"https:\/\/docs.sourcegraph.com\/cli\/quickstart\" target=\"_blank\" rel=\"noreferrer noopener\">CLI<\/a>.<\/p>\n<h2>Using the Web app<\/h2>\n<p>When using the Sourcegraph <a href=\"https:\/\/sourcegraph.com\/search?q=context:global+repo:%5Esrc.fedoraproject.org\/&amp;patternType=regexp\" target=\"_blank\" rel=\"noreferrer noopener\">web app<\/a>, you need to start each query with <strong>repo:^src.fedoraproject.org\/<\/strong> before entering any search terms. 
Using this link to the <a href=\"https:\/\/sourcegraph.com\/search?q=context:global+repo:%5Esrc.fedoraproject.org\/&amp;patternType=regexp\" target=\"_blank\" rel=\"noreferrer noopener\">web app<\/a> will include this initial string as shown here:<\/p>\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-10.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"335\" src=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories.png\" alt=\"\" class=\"wp-image-36186\" \/><\/a><figcaption>Sourcegraph web app interface<\/figcaption><\/figure>\n<p>The following sections will provide some web app examples of searches that might be of interest.<\/p>\n<h3>Find repositories using popular OSI-approved licenses&nbsp;<\/h3>\n<p>The following query will scan all the repositories for software that is compatible with the &#8220;Open Source Definition&#8221; (OSD).<\/p>\n<pre class=\"wp-block-preformatted\">repo:^src.fedoraproject.org\/ lang:\"RPM Spec\" License: ^.*apache|bsd|gpl|lgpl|mit|mpl|cddl|epl.*$<\/pre>\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-16.png\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"513\" src=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-1.png\" alt=\"\" class=\"wp-image-36228\" \/><\/a><figcaption>License search<\/figcaption><\/figure>\n<div class=\"wp-container-624a062956cb7 wp-block-buttons\">\n<div class=\"wp-block-button is-style-fill\"><a class=\"wp-block-button__link has-vivid-cyan-blue-background-color has-background\" 
href=\"https:\/\/sourcegraph.com\/search?q=context:global+repo:%5Esrc.fedoraproject.org\/+lang:%22RPM+Spec%22+License:+%5E.*apache%7Cbsd%7Cgpl%7Clgpl%7Cmit%7Cmpl%7Ccddl%7Cepl.*%24&amp;patternType=regexp\" target=\"_blank\" rel=\"noreferrer noopener\">Try it!<\/a><\/div>\n<\/div>\n<h3>Find files with TODOs<\/h3>\n<p>The following query can find TODOs in 34k repositories. This is great for those looking to contribute to projects that need help.<\/p>\n<pre class=\"wp-block-preformatted\">repo:^src.fedoraproject.org\/ \"TODO\"<\/pre>\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-20.png\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"605\" src=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-2.png\" alt=\"\" class=\"wp-image-36229\" \/><\/a><figcaption>Search for TODO<\/figcaption><\/figure>\n<div class=\"wp-container-624a062957177 wp-block-buttons\">\n<div class=\"wp-block-button is-style-outline\"><a class=\"wp-block-button__link has-white-color has-vivid-cyan-blue-background-color has-text-color has-background\" href=\"https:\/\/sourcegraph.com\/search?q=context:global+repo:%5Esrc.fedoraproject.org\/+%22TODO%22&amp;patternType=regexp&amp;case=yes\" target=\"_blank\" rel=\"noreferrer noopener\">Try it!<\/a><\/div>\n<\/div>\n<h3>Find files being served via FTP<\/h3>\n<p>A co-worker of mine from back in the day told me \u201cFTP is a dead protocol\u201d. Is it? 
You can extend this query to find other protocols, such as irc or https.<\/p>\n<pre class=\"wp-block-preformatted\">repo:^src.fedoraproject.org\/ (?:ftp):\\\/\\\/[A-Za-z0-9\\-]{0,63}(\\.[A-Za-z0-9\\-]{0,63})+(:\\d{1,4})?\\\/*(\\\/*[A-Za-z0-9\\-._]+\\\/*)*(\\?.*)?(#.*)?<\/pre>\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-25.png\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"457\" src=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-3.png\" alt=\"\" class=\"wp-image-36230\" \/><\/a><figcaption>Search for protocol<\/figcaption><\/figure>\n<div class=\"wp-container-624a0629575cb wp-block-buttons\">\n<div class=\"wp-block-button is-style-outline\"><a class=\"wp-block-button__link has-white-color has-vivid-cyan-blue-background-color has-text-color has-background\" href=\"https:\/\/sourcegraph.com\/search?q=context:global+repo:%5Esrc.fedoraproject.org\/+%28%3F:ftp%29:%5C\/%5C\/%5BA-Za-z0-9%5C-%5D%7B0%2C63%7D%28%5C.%5BA-Za-z0-9%5C-%5D%7B0%2C63%7D%29%2B%28:%5Cd%7B1%2C4%7D%29%3F%5C\/*%28%5C\/*%5BA-Za-z0-9%5C-._%5D%2B%5C\/*%29*%28%5C%3F.*%29%3F%28%23.*%29%3F&amp;patternType=regexp\" target=\"_blank\" rel=\"noreferrer noopener\">Try it!<\/a><\/div>\n<\/div>\n<h3>Find files with a vulnerable version of Log4j<\/h3>\n<p>This query finds files that may be vulnerable to CVE-2021-44228, also known as Log4Shell (false positives can happen). 
You can also search for other vulnerabilities that can then be reported to project maintainers.<\/p>\n<pre class=\"wp-block-preformatted\">repo:^src.fedoraproject.org\/ org\\.apache\\.logging\\.log4j 2.((0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15)(\\.[0-9]+)) count:all<\/pre>\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-30.png\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"295\" src=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-4.png\" alt=\"\" class=\"wp-image-36231\" \/><\/a><figcaption>Search for log4j<\/figcaption><\/figure>\n<div class=\"wp-container-624a0629579ed wp-block-buttons\">\n<div class=\"wp-block-button is-style-outline\"><a class=\"wp-block-button__link has-white-color has-vivid-cyan-blue-background-color has-text-color has-background\" href=\"https:\/\/sourcegraph.com\/search?q=context:global+repo:%5Esrc.fedoraproject.org\/+org%5C.apache%5C.logging%5C.log4j+2.%28%280%7C1%7C2%7C3%7C4%7C5%7C6%7C7%7C8%7C9%7C10%7C11%7C12%7C13%7C14%7C15%29%28%5C.%5B0-9%5D%2B%29%29+count:all&amp;patternType=regexp\" target=\"_blank\" rel=\"noreferrer noopener\">Try it!<\/a><\/div>\n<\/div>\n<h2>Use the CLI<\/h2>\n<p>Sourcegraph also has a command-line interface tool called <a href=\"https:\/\/github.com\/sourcegraph\/src-cli#readme\" target=\"_blank\" rel=\"noreferrer noopener\">src<\/a>, which can run all of the searches shown above and can also return results as JSON for programmatic consumption.<\/p>\n<pre class=\"wp-block-preformatted\">src search -json 'repo:^src.fedoraproject.org\/ lang:\"RPM Spec\" License: ^.*apache|bsd|gpl|lgpl|mit|mpl|cddl|epl.*$'<\/pre>\n<h3>JSON output<\/h3>\n<figure class=\"wp-block-image size-large\"><a 
href=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-34.png\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"521\" src=\"https:\/\/www.sickgaming.net\/blog\/wp-content\/uploads\/2022\/04\/using-sourcegraph-to-search-34000-fedora-repositories-5.png\" alt=\"\" class=\"wp-image-36158\" \/><\/a><figcaption>JSON output<\/figcaption><\/figure>\n<div class=\"wp-container-624a062957e3b wp-block-buttons\">\n<div class=\"wp-block-button is-style-outline\"><a class=\"wp-block-button__link has-white-color has-vivid-cyan-blue-background-color has-text-color has-background\" href=\"https:\/\/sourcegraph.com\/notebooks\/Tm90ZWJvb2s6MzQ2\" target=\"_blank\" rel=\"noreferrer noopener\">Try it!<\/a><\/div>\n<\/div>\n<h2>Search Syntax<\/h2>\n<p>The examples above are a starting point, not an exhaustive list of possible queries. You can <a href=\"https:\/\/docs.sourcegraph.com\/code_search\/reference\/queries\" target=\"_blank\" rel=\"noreferrer noopener\">view the full search query syntax<\/a> and create your own queries as needed.<\/p>\n<h2>Conclusion<\/h2>\n<p>As you can see, with Sourcegraph, the Fedora Linux community can now quickly search all code hosted at <a href=\"https:\/\/src.fedoraproject.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">src.fedoraproject.org<\/a>, using either literal or complex regex queries.<\/p>\n<p>I appreciate the Fedora Linux community being so helpful and welcoming. If you have questions or anything to add, my team and I will be in the comments section below. 
You can also <a href=\"https:\/\/srcgr.ph\/wp-join-community-space\" target=\"_blank\" rel=\"noreferrer noopener\">join us on Slack<\/a>.<\/p>\n<p>Special thanks to <a href=\"https:\/\/twitter.com\/vanesacodes\" target=\"_blank\" rel=\"noreferrer noopener\">Vanesa Ortiz<\/a> for <a href=\"https:\/\/discussion.fedoraproject.org\/t\/fedora-sourcegraph-marketing-community-collaboration\/36151\" target=\"_blank\" rel=\"noreferrer noopener\">making this collaboration happen<\/a>, <a href=\"https:\/\/handbook.sourcegraph.com\/team\/#ben-venker\" target=\"_blank\" rel=\"noreferrer noopener\">Ben Venker<\/a> for his help fixing my broken regex (multiple times), as well as <a href=\"https:\/\/handbook.sourcegraph.com\/team\/#rebecca-dodd\" target=\"_blank\" rel=\"noreferrer noopener\">Rebecca Dodd<\/a> and <a href=\"https:\/\/twitter.com\/nickwritesit\" target=\"_blank\" rel=\"noreferrer noopener\">Nick Moore<\/a> for their help with editing.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In October 2021, a Fedora Linux user asked a question about licensing. Fedora Project Leader Matthew Miller left a response: \u201cSince we don&#8217;t have a complete, exploded, searchable repository of all of the packages in Fedora, I don&#8217;t have a quick way to check.\u201d&nbsp; Followed by: \u201c&#8230;or possibly pay Sourcegraph to do it for us. 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":123524,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[48],"tags":[45,61,46,47,77],"class_list":["post-123523","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-fedora-os","tag-fedora","tag-fedora-project-community","tag-magazine","tag-news","tag-software"],"_links":{"self":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/123523","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/comments?post=123523"}],"version-history":[{"count":0,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/123523\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/media\/123524"}],"wp:attachment":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/media?parent=123523"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/categories?post=123523"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/tags?post=123523"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}