Apache Lucene

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

Apache Lucene is a free and open-source search engine software library, originally written in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache Software License. Lucene is widely used as a standard foundation for production search applications.

28-464: Lucene has been ported to other programming languages including Object Pascal, Perl, C#, C++, Python, Ruby and PHP. Doug Cutting originally wrote Lucene in 1999. Lucene was his fifth search engine. He had previously written two while at Xerox PARC, one at Apple, and a fourth at Excite. It was initially available for download from its home at the SourceForge web site. It joined

56-436: A class library and framework for Mac programming that ran under both Think C and Think Pascal. This largely replaced MacApp as the de facto class library for Mac programming. Like Think C, this remained a market leader until the release of Metrowerks' PowerPlant, which was generally regarded to be superior. During the early 1990s, Think and Apple collaborated on a cross-platform library known as Bedrock, but this effort

84-499: A command-line interface (CLI), Apache Module (Celerity), and CodeRunner (a Node.js-like solution able to handle different scripts per port), besides the ability to compile and protect a script's source code. Here are several "Hello World" programs in different Object Pascal versions. Still supported in Delphi and Free Pascal. FPC also packages its own substitutes for the libraries/units. Delphi does not. The Free Pascal 1.0 series and

112-623: A distinct location as in other Object Pascal dialects. Many features have been introduced continuously to Object Pascal with extensions to Delphi and extensions to Free Pascal. In reaction to criticism, Free Pascal has adopted generics with the same syntax as Delphi, provided Delphi compatibility mode is selected, and both Delphi (partial) and Free Pascal (more extensive) support operator overloading. Delphi has also introduced many other features since version 7, including generics. Whereas Free Pascal tries to be compatible with Delphi in Delphi compatibility mode, it has also usually introduced many new features to

140-666: A much faster compile–link–debug cycle, and tight integration of its tools. The last official release of Think Pascal was 4.01, in 1992. Symantec later released an unofficial version 4.5d4 at no charge. Apple dropped support for Object Pascal when they moved from Motorola 68000 series chips to IBM's PowerPC architecture in 1994. MacApp 3.0 had already been rewritten in C++ and ported to this platform. Metrowerks offered with CodeWarrior an Object Pascal compiler for Macintosh that targeted both 68k and PowerPC, both in their IDE and as MPW tools. Macintosh developers using Object Pascal had

168-713: A path to port to the PowerPC architecture, even after both Apple and Symantec dropped support. MacApp 2.0, written in Object Pascal, was ported to the PowerPC using CodeWarrior. In 1986, Borland introduced similar extensions, also named Object Pascal, to the Turbo Pascal product for the Macintosh, and in 1989 for Turbo Pascal 5.5 for DOS. When Borland refocused from DOS to Windows in 1994, they created

196-542: A successor to Turbo Pascal, named Delphi, and introduced a new set of extensions to create what is now known as the Delphi language. The development of Delphi started in 1993 and Delphi 1.0 was officially released in the United States on 14 February 1995. While code using the Turbo Pascal object model could still be compiled, Delphi featured a new syntax using the keyword class in preference to object,

224-424: Is just an indexing and search library and does not contain crawling and HTML parsing functionality. However, several projects extend Lucene's capability:

Object Pascal

Object Pascal is an extension to the programming language Pascal that provides object-oriented programming (OOP) features such as classes and methods. The language was originally developed by Apple Computer as Clascal for

252-469: The Apache Solr search server joined as a Lucene sub-project, merging the developer communities. Version 4.0 was released on October 12, 2012. In March 2021, Lucene changed its logo, and Apache Solr became a top-level Apache project again, independent from Lucene. While suitable for any application that requires full-text indexing and searching capability, Lucene is recognized for its utility in

280-534: The Lisa Workshop development system. As Lisa gave way to Macintosh, Apple collaborated with Niklaus Wirth, the author of Pascal, to develop an officially standardized version of Clascal. This was renamed Object Pascal. Through the mid-1980s, Object Pascal was the main programming language for early versions of the MacApp application framework. The language lost its place as the main development language on

308-619: The Apache Software Foundation's Jakarta family of open-source Java products in September 2001 and became its own top-level Apache project in February 2005. The name Lucene is Doug Cutting's wife's middle name and her maternal grandmother's first name. Lucene formerly included a number of sub-projects, such as Lucene.NET, Mahout, Tika and Nutch. These are now independent top-level projects. In March 2010,

336-505: The C standard that conformed more closely to the needs of Mac OS programming. After version 6, the OOP facilities were expanded to a full C++ implementation, and the product was rebranded Symantec C++ starting with version 7, then under development by different authors. Version 8 brought support for compiling to PowerPC. Think's almost complete ownership of the Mac programming market was broken with

364-590: The Create constructor and a virtual Destroy destructor (removing the need to call the New and Dispose procedures), properties, method pointers, and some other things. These were inspired by the ISO working draft for object-oriented extensions, but many of the differences from Turbo Pascal's dialect (such as the draft's requirement that all methods be virtual) were ignored. The Delphi language has continued to evolve over

392-578: The FPC textmode IDE are the largest open codebases in this dialect. Free Pascal 2.0 was rewritten in a more Delphi-like dialect, and the textmode IDE and related frameworks (Free Vision) are the only parts in the TP version of Object Pascal. Another example: This works based on pointer copy, unless there is a specific allocation for a deeper copy. Note that the object construct is still available in Delphi and Free Pascal. The method implementation can also be made in

420-539: The Mac in 1991 with the release of the C++-based MacApp 3.0. Official support ended in 1996. Symantec also developed a compiler for Object Pascal for their Think Pascal product, which could compile programs much faster than Apple's own Macintosh Programmer's Workshop (MPW). Symantec then developed the Think Class Library (TCL), based on MacApp concepts, which could be called from both Object Pascal and THINK C. The Think suite largely displaced MPW as

448-521: The Macintosh user interface throughout and was extremely fast. It quickly became the de facto C environment on the Mac, and the related Think Pascal quickly did the same for Object Pascal development. THINK Technologies was later bought by Symantec Corporation and the product continued to be developed by the original author, Michael Kahl. Versions 3 and later were essentially a subset of C++ and supported basic object-oriented programming (OOP) concepts such as single inheritance, and extensions to

476-595: The PC into the early 2000s, and was partly displaced in the 2000s with the introduction of the .NET Framework, which included Hejlsberg's C#. Pascal became a major language in the programming world in the 1970s, with high-quality implementations on most minicomputer platforms and microcomputers. Among the latter was the UCSD Pascal system, which compiled to an intermediate p-System code format that could then run on multiple platforms. Apple licensed UCSD and used it as

504-569: The basis for their Apple Pascal system for the Apple II and Apple III. Pascal became one of the major languages in the company in this period. With the start of the Apple Lisa project, Pascal was selected as the main programming language of the platform, although this time as a compiler in contrast to the p-System interpreter. Object Pascal is an extension of the Pascal language that

532-502: The developer tool business. THINK Reference was a proprietary documentation database and browser developed by Symantec for programmers on the classic Mac OS platform. It was included with the THINK C development environment sold by Symantec, and previously included with THINK Pascal. It contained a hypertext version of Apple Computer's Macintosh Toolbox API specifications, along with illustrative code samples. THINK Reference

560-953: The implementation of Internet search engines and local, single-site searching. Lucene includes a feature to perform a fuzzy search based on edit distance. Lucene has also been used to implement recommendation systems. For example, Lucene's 'MoreLikeThis' class can generate recommendations for similar documents. In a comparison of the term vector-based similarity approach of 'MoreLikeThis' with citation-based document similarity measures, such as co-citation and co-citation proximity analysis, Lucene's approach excelled at recommending documents with very similar structural characteristics and narrower relatedness. In contrast, citation-based document similarity measures tended to be more suitable for recommending more broadly related documents, meaning citation-based approaches may be more suitable for generating serendipitous recommendations, as long as the documents to be recommended contain in-text citations. Lucene itself
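
As a rough illustration of the fuzzy search idea mentioned in this fragment, the sketch below computes the Levenshtein edit distance between two strings and keeps the terms that fall within a distance limit. It is a plain dynamic-programming sketch in Python, not Lucene's implementation or API; the names edit_distance, fuzzy_matches and max_edits are hypothetical and chosen only for this example.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def fuzzy_matches(query: str, terms: Iterable[str], max_edits: int = 2) -> List[str]:
    """Return the terms within max_edits of query (hypothetical helper)."""
    return [term for term in terms if edit_distance(query, term) <= max_edits]
```

Calling fuzzy_matches("lucene", ["lucene", "lucine", "lucerne", "solr"], max_edits=1) keeps the first three terms and drops "solr". A linear scan like this is only practical for small term lists; a real search library needs a much more efficient strategy over its term dictionary.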

588-494: The introduction of the PowerPC-based Macs in the early 1990s. Although Symantec released updates that ran on these platforms, these were not released until the machines had been on the market for almost a year. In the meantime, Metrowerks' product, CodeWarrior, took control of the market, being both faster and easier to use than Think's. Starting with version 4.0, Think included the Think Class Library (TCL),

616-408: The language that are not always available in Delphi.

THINK C

Think C (stylized as THINK C), originally known as LightSpeed C, is an extension of the C programming language for the classic Mac OS developed by THINK Technologies, released first in mid-1986. THINK was founded by Andrew Singer, Frank Sinton and Mel Conway. LightSpeed C was widely lauded when it was released, as it used

644-574: The main development platform on the Mac in the late 1980s. Symantec ported Object Pascal to the PC, and developed a similar object framework on that platform. In contrast to TCL, which eventually migrated to C++, the PC libraries remained mainly based on Pascal. Borland added support for object-oriented programming to Turbo Pascal 5.5, which would eventually become the basis for the Object Pascal dialect used in Delphi, created by Anders Hejlsberg. Delphi remained mainstream for business applications on

672-491: The project, which began very early in 1985 and became a product in 1986. An Object Pascal extension was also implemented in the Think Pascal integrated development environment (IDE). The IDE includes the compiler and an editor with syntax highlighting and checking, a powerful debugger, and a class library. Many developers preferred Think Pascal over Apple's implementation of Object Pascal because Think Pascal offered

700-663: The years to support constructs such as dynamic arrays, generics and anonymous methods. The old object syntax introduced by Apple ("Old-Style Object Types") is still supported. Object Pascal compilers are available for a wide range of operating systems and architectures. Pascal Script (formerly InnerFuse) and DWScript (Delphi Web Script) are open-source Object Pascal interpreters and scripting engines written in Delphi. They support subsets of Object Pascal. DWScript can also compile Object Pascal code into JavaScript code (Smart Pascal), and supports just-in-time compilation (JIT). Modern Pascal provides 3 different interpreters:

728-517: Was abandoned in 1993, by which time PowerPlant was the clear market leader. Despite the decline in popularity of their IDE, Symantec was eventually chosen by Apple to provide next-generation C/C++ compilers for MPW in the form of Sc/Scpp for 68K alongside MrC/MrCpp for PowerPC. These remained Apple's standard compilers until the arrival of Mac OS X replaced them with the GNU Compiler Collection (GCC). Symantec subsequently exited

756-597: Was developed at Apple Computer by a team led by Larry Tesler in consultation with Niklaus Wirth, the inventor of Pascal. It is descended from an earlier object-oriented version of Pascal named Clascal, which was available on the Lisa computer. Object Pascal was needed to support MacApp, an expandable Macintosh application framework that would now be termed a class library. Object Pascal extensions, and MacApp, were developed by Barry Haynes, Ken Doyle, and Larry Rosenstein, and were tested by Dan Allen. Larry Tesler oversaw

784-670: Was discontinued in 1994. Bruce F. Webster of BYTE named Lightspeed C product of the month for September 1986. While criticizing the documentation as its "single greatest weakness", Webster stated that Lightspeed C would be the choice if he had to select one compiler for the Macintosh. BYTE in 1989 listed Lightspeed C as among the "Distinction" winners of the Byte Awards, stating that it "wins our respect because of its powerful features and low price". THINK C 5.0 obtained a 4 (out of 5) rating in the July 1992 issue of Macworld, praising
