Misplaced Pages

BitFunnel

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

BitFunnel is the search engine indexing algorithm and a set of components used in the Bing search engine , which were made open source in 2016. BitFunnel uses bit-sliced signatures instead of an inverted index in an attempt to reduce operations cost.

#457542

44-615: Progress on the implementation of BitFunnel was made public in early 2016, with the expectation that there would be a usable implementation later that year. In September 2016, the source code was made available via GitHub . A paper discussing the BitFunnel algorithm and implementation was released as through the Special Interest Group on Information Retrieval of the Association for Computing Machinery in 2017 and won

88-455: A beta release . Its name was chosen as a compound of Git and hub . GitHub, Inc. was originally a flat organization with no middle managers, instead relying on self-management . Employees could choose to work on projects that interested them ( open allocation ), but the chief executive set salaries. In 2014, the company added a layer of middle management in response to serious harassment allegations against its senior leadership. As

132-576: A series B round . The lead investor was Sequoia Capital , and other investors were Andreessen Horowitz , Thrive Capital , IVP (Institutional Venture Partners) and other venture capital funds. The company was then valued at approximately $ 2 billion. As of 2023, GitHub was estimated to generate $ 1 billion in revenue. The GitHub service was developed by Chris Wanstrath , P. J. Hyett , Tom Preston-Werner , and Scott Chacon using Ruby on Rails , and started in February 2008. The company, GitHub, Inc.,

176-509: A static web hosting service for blogs , project documentation, and books. All GitHub Pages content is stored in a Git repository as files served to visitors verbatim or in Markdown format. GitHub is integrated with Jekyll static website and blog generator and GitHub continuous integration pipelines. Each time the content source is updated, Jekyll regenerates the website and automatically serves it via GitHub Pages infrastructure. Like

220-409: A Campus Expert, applicants must complete an online training course with multiple modules to develop community leadership skills. GitHub also provides some software as a service (SaaS) integrations for adding extra features to projects. Those services include: GitHub Sponsors allows users to make monthly money donations to projects hosted on GitHub. The public beta was announced on May 23, 2019, and

264-406: A cloud provider and has been available as of November 2011 . In November 2020, source code for GitHub Enterprise Server was leaked online in an apparent protest against DMCA takedown of youtube-dl . According to GitHub, the source code came from GitHub accidentally sharing the code with Enterprise customers themselves, not from an attack on GitHub servers. In 2008, GitHub introduced GitHub Pages,

308-559: A community, platform and business. Under Microsoft, the service was led by Xamarin 's Nat Friedman , reporting to Scott Guthrie , executive vice president of Microsoft Cloud and AI. Nat Friedman resigned November 3, 2021; he was replaced by Thomas Dohmke. There have been concerns from developers Kyle Simpson, JavaScript trainer and author, and Rafael Laguna, CEO at Open-Xchange over Microsoft's purchase, citing uneasiness over Microsoft's handling of previous acquisitions, such as Nokia's mobile business and Skype . This acquisition

352-555: A later time. In addition, GitHub supports the following formats and features: GitHub's Terms of Service do not require public software projects hosted on GitHub to meet the Open Source Definition . The terms of service state, "By setting your repositories to be viewed publicly, you agree to allow others to view and fork your repositories." GitHub Enterprise is a self-managed version of GitHub with similar functionality. It can be run on an organization's hardware or

396-734: A query for a document (Q) can be defined as a union: S Q → = ⋃ t ∈ Q S t → {\displaystyle {\overrightarrow {S_{Q}}}=\bigcup _{t\in Q}{\overrightarrow {S_{t}}}} Additionally, a document D is a member of the set M' when the following condition is satisfied: S Q → ∩ S D → = S Q → {\displaystyle {\overrightarrow {S_{Q}}}\cap {\overrightarrow {S_{D}}}={\overrightarrow {S_{Q}}}} This knowledge

440-577: A registered user account, users can have discussions, manage repositories, submit contributions to others' repositories, and review changes to code . GitHub began offering limited private repositories at no cost in January 2019 (limited to three contributors per project). Previously, only public repositories were free. On April 14, 2020, GitHub made "all of the core GitHub features" free for everyone, including "private repositories with unlimited collaborators." The fundamental software that underpins GitHub

484-522: A result of the scandal, Tom Preston-Werner resigned from his position as CEO. GitHub was a bootstrapped start-up business , which in its first years provided enough revenue to be funded solely by its three founders and start taking on employees. In July 2012, four years after the company was founded, Andreessen Horowitz invested $ 100 million in venture capital with a $ 750 million valuation. In July 2015 GitHub raised another $ 250 million (~$ 314 million in 2023) of venture capital in

SECTION 10

#1732787726458

528-416: A search engine are available for issue tracking. For version control, Git (and, by extension, GitHub) allows pull requests to propose changes to the source code. Users who can review the proposed changes can see a diff between the requested changes and approve them. In Git terminology, this action is called "committing" and one instance of it is a "commit." A history of all commits is kept and can be viewed at

572-510: A significant user of GitHub, using it to host open-source projects and development tools such as .NET Core , Chakra Core , MSBuild , PowerShell , PowerToys , Visual Studio Code , Windows Calculator , Windows Terminal and the bulk of its product documentation (now to be found on Microsoft Docs ). On June 4, 2018, Microsoft announced its intent to acquire GitHub for US$ 7.5 billion (~$ 8.96 billion in 2023). The deal closed on October 26, 2018. GitHub continued to operate independently as

616-401: A statement denying Horvath's allegations. However, following an internal investigation, GitHub confirmed the claims. GitHub's CEO Chris Wanstrath wrote on the company blog, "The investigation found Tom Preston-Werner in his capacity as GitHub's CEO acted inappropriately, including confrontational conduct, disregard of workplace complaints, insensitivity to the impact of his spouse's presence in

660-403: A total of 135,000 repositories. In 2010, GitHub was hosting 1 million repositories. A year later, this number doubled. ReadWriteWeb reported that GitHub had surpassed SourceForge and Google Code in total number of commits for the period of January to May 2011. On January 16, 2013, GitHub passed the 3 million users mark and was then hosting more than 5 million repositories. By the end of

704-491: A website that enables designers to market royalty-free digital images . The illustration GitHub chose was a character that Oxley had named Octopuss. Since GitHub wanted Octopuss for their logo (a use that the iStock license disallows), they negotiated with Oxley to buy exclusive rights to the image. GitHub renamed Octopuss to Octocat, and trademarked the character along with the new name. Later, GitHub hired illustrator Cameron McEfee to adapt Octocat for different purposes on

748-469: Is Git itself, written by Linus Torvalds , creator of Linux. The additional software that provides the GitHub user interface was written using Ruby on Rails and Erlang by GitHub, Inc. developers Wanstrath, Hyett, and Preston-Werner. The primary purpose of GitHub is to facilitate the version control and issue tracking aspects of software development. Labels, milestones, responsibility assignment, and

792-709: Is a developer platform that allows developers to create, store, manage and share their code. It uses Git software, which provides distributed version control of access control , bug tracking , software feature requests, task management , continuous integration , and wikis for every project. Headquartered in California , it has been a subsidiary of Microsoft since 2018. It is commonly used to host open source software development projects. As of January 2023 , GitHub reported having over 100 million developers and more than 420 million repositories , including at least 28 million public repositories. It

836-510: Is indicated on the gist page. GitHub launched a new program called the GitHub Student Developer Pack to give students free access to more than a dozen popular development tools and services. GitHub partnered with Bitnami , Crowdflower , DigitalOcean , DNSimple, HackHands , Namecheap , Orchestrate, Screenhero, SendGrid , Stripe , Travis CI , and Unreal Engine to launch the program. In 2016, GitHub announced

880-726: Is maintained with a map of keywords. In contrast, BitFunnel represents each searchable item through a signature. A signature is a sequence of bits which describe a Bloom filter of the searchable terms in a given searchable item. The bloom filter is constructed through hashing through several bit positions. The signature of a document (D) can be described as the logical-or of its term signatures: S D → = ⋃ t ∈ D S t → {\displaystyle {\overrightarrow {S_{D}}}=\bigcup _{t\in D}{\overrightarrow {S_{t}}}} Similarly,

924-424: Is required by law. This includes keeping public repositories services, including those for open source projects, available and accessible to support personal communications involving developers in sanctioned regions. Developers who feel that they should not have restrictions can appeal for the removal of said restrictions, including those who only travel to, and do not reside in, those countries. GitHub has forbidden

SECTION 20

#1732787726458

968-454: Is the world's largest source code host as of June 2023 . Over five billion developer contributions were made to more than 500 million open source projects in 2024. The development of the GitHub platform began on October 19, 2007. The site was launched in April 2008 by Tom Preston-Werner , Chris Wanstrath , P. J. Hyett and Scott Chacon after it had been available for a few months as

1012-575: Is then combined to produce a formula where M' is identified by documents which match the query signature: M ′ = { D ∈ C ∣ S Q → ∩ S D → = S Q → } {\displaystyle M'=\left\{D\in C\mid {\overrightarrow {S_{Q}}}\cap {\overrightarrow {S_{D}}}={\overrightarrow {S_{Q}}}\right\}} These steps and their proofs are discussed in

1056-505: Is usually used for larger projects. Tom Preston-Werner débuted the feature at a Ruby conference in 2008. Gist builds on the traditional simple concept of a pastebin by adding version control for code snippets, easy forking, and TLS encryption for private pastes. Because each "gist" is its own Git repository, multiple code snippets can be contained in a single page, and they can be pushed and pulled using Git. Unregistered users could upload Gists until March 19, 2018, when uploading Gists

1100-570: The Svalbard Global Seed Vault . The archive contained the code of all active public repositories, as well as that of dormant but significant public repositories. The 21 TB of data was stored on piqlFilm archival film reels as matrix (2D) barcode ( Boxing barcode ), and is expected to last 500–1,000 years. The GitHub Archive Program is also working with partners on Project Silica, in an attempt to store all public repositories for 10,000 years. It aims to write archives into

1144-975: The 2017 paper. This algorithm is described in the 2017 paper. M ′ = ∅ foreach   D ∈ C   do if   S D → ∩ S Q → = S Q →   then M ′ = M ′ ∪ { D } endif endfor {\displaystyle {\begin{array}{l}M'=\emptyset \\{\texttt {foreach}}\ D\in C\ {\texttt {do}}\\\qquad {\texttt {if}}\ {\overrightarrow {S_{D}}}\cap {\overrightarrow {S_{Q}}}={\overrightarrow {S_{Q}}}\ {\texttt {then}}\\\qquad \qquad M'=M'\cup \{D\}\\\qquad {\texttt {endif}}\\{\texttt {endfor}}\end{array}}} GitHub GitHub ( / ˈ ɡ ɪ t h ʌ b / )

1188-493: The Best Paper Award. BitFunnel consists of three major components: The BitFunnel paper describes the "matching problem", which occurs when an algorithm must identify documents through the usage of keywords. The goal of the problem is to identify a set of matches given a corpus to search and a query of keyword terms to match against. This problem is commonly solved through inverted indexes , where each searchable item

1232-525: The first year: it pledges to cover payment processing costs and match sponsorship payments up to $ 5,000 per developer. Furthermore, users can still use similar services like Patreon and Open Collective and link to their websites. In July 2020, GitHub stored a February archive of the site in an abandoned mountain mine in Svalbard , Norway, part of the Arctic World Archive and not far from

1276-536: The launch of the GitHub Campus Experts program to train and encourage students to grow technology communities at their universities. The Campus Experts program is open to university students 18 years and older worldwide. GitHub Campus Experts are one of the primary ways that GitHub funds student-oriented events and communities, Campus Experts are given access to training, funding, and additional resources to run events and grow their communities. To become

1320-433: The media through a spokesperson, saying: GitHub is subject to US trade control laws, and is committed to full compliance with applicable law. At the same time, GitHub's vision is to be the global platform for developer collaboration, no matter where developers reside. As a result, we take seriously our responsibility to examine government mandates thoroughly to be certain that users and customers are not impacted beyond what

1364-421: The molecular structure of quartz glass platters, using a high-precision petahertz pulse laser, i.e. one that pulses a quadrillion (1,000,000,000,000,000) times per second. In March 2014, GitHub programmer Julie Ann Horvath alleged that founder and CEO Tom Preston-Werner and his wife, Theresa, engaged in a pattern of harassment against her that led to her leaving the company. In April 2014, GitHub released

BitFunnel - Misplaced Pages Continue

1408-409: The project accepts waitlist registrations. The Verge said that GitHub Sponsors "works exactly like Patreon " because "developers can offer various funding tiers that come with different perks, and they'll receive recurring payments from supporters who want to access them and encourage their work" except with "zero fees to use the program." Furthermore, GitHub offers incentives for early adopters during

1452-422: The rest of GitHub, it includes free and paid service tiers. Websites generated through this service are hosted either as subdomains of the github.io domain or can be connected to custom domains bought through a third-party domain name registrar . GitHub Pages supports HTTPS encryption. GitHub also operates a pastebin -style site called Gist , which is for code snippets , as opposed to GitHub proper, which

1496-503: The sale bolstered interest in competitors: Bitbucket (owned by Atlassian ), GitLab and SourceForge (owned by BIZX, LLC) reported that they had seen spikes in new users intending to migrate projects from GitHub to their respective services. In September 2019, GitHub acquired Semmle , a code analysis tool. In February 2020, GitHub launched in India under the name GitHub India Private Limited. In March 2020, GitHub announced that it

1540-411: The site provides social networking -like functions such as feeds, followers, wikis (using wiki software called Gollum ), and a social network graph to display how developers work on their versions (" forks ") of a repository and what fork (and branch within that fork) is newest. Anyone can browse and download public repositories, but only registered users can contribute content to repositories. With

1584-569: The use of VPNs and IP proxies to access the site from sanctioned countries, as purchase history and IP addresses are how they flag users, among other sources. On December 4, 2014, Russia blacklisted GitHub.com because GitHub initially refused to take down user-posted suicide manuals. After a day, Russia withdrew its block, and GitHub began blocking specific content and pages in Russia. On December 31, 2014, India blocked GitHub.com along with 31 other websites over pro- ISIS content posted by users;

1628-454: The website and promotional materials; McEfee and various GitHub users have since created hundreds of variations of the character, which are available on The Octodex . Projects on GitHub can be accessed and managed using the standard Git command-line interface; all standard Git commands work with it. GitHub also allows users to browse public repositories on the site. Multiple desktop clients and Git plugins are also available. In addition,

1672-567: The workplace, and failure to enforce an agreement that his spouse should not work in the office." Preston-Werner subsequently resigned from the company. The firm then announced it would implement new initiatives and trainings "to make sure employee concerns and conflicts are taken seriously and dealt with appropriately." On July 25, 2019, a developer based in Iran wrote on Medium that GitHub had blocked his private repositories and prohibited access to GitHub pages. Soon after, GitHub confirmed that it

1716-557: The year, the number of repositories was twice as great, reaching 10 million repositories. In 2015, GitHub opened an office in Japan, its first outside of the U.S. On February 28, 2018, GitHub fell victim to the third-largest distributed denial-of-service (DDoS) attack in history, with incoming traffic reaching a peak of about 1.35 terabits per second. On June 19, 2018, GitHub expanded its GitHub Education by offering free education bundles to all schools. From 2012, Microsoft became

1760-497: Was acquiring npm , a JavaScript packaging vendor, for an undisclosed sum of money. The deal was closed on April 15, 2020. In early July 2020, the GitHub Archive Program was established to archive its open-source code in perpetuity. GitHub's mascot is an anthropomorphized "octocat" with five octopus-like arms . The character was created by graphic designer Simon Oxley as clip art to sell on iStock ,

1804-588: Was formed in 2007 and is located in San Francisco. On February 24, 2009, GitHub announced that within the first year of being online, GitHub had accumulated over 46,000 public repositories, 17,000 of which were formed in the previous month. At that time, about 6,200 repositories had been forked at least once, and 4,600 had been merged. That same year, the site was used by over 100,000 users, according to GitHub, and had grown to host 90,000 unique public repositories, 12,000 having been forked at least once, for

BitFunnel - Misplaced Pages Continue

1848-446: Was in line with Microsoft's business strategy under CEO Satya Nadella , which has seen a larger focus on cloud computing services, alongside the development of and contributions to open-source software. Harvard Business Review argued that Microsoft was intending to acquire GitHub to get access to its user base, so it can be used as a loss leader to encourage the use of its other development products and services. Concerns over

1892-452: Was now blocking developers in Iran , Crimea , Cuba , North Korea , and Syria from accessing private repositories. However, GitHub reopened access to GitHub Pages days later, for public repositories regardless of location. It was also revealed that using GitHub while visiting sanctioned countries could result in similar actions occurring on a user's account. GitHub responded to complaints and

1936-446: Was restricted to logged-in users, reportedly to mitigate spamming on the page of recent Gists. Gists' URLs use hexadecimal IDs, and edits to Gists are recorded in a revision history , which can show the text difference of thirty revisions per page with an option between a "split" and "unified" view. Like repositories, Gists can be forked, "starred", i.e., publicly bookmarked, and commented on. The count of revisions, stars, and forks

#457542