ImageNet - Misplaced Pages

The ImageNet project is a large visual database designed for use in visual object recognition software research. More than 14 million images have been hand-annotated by the project to indicate what objects are pictured, and in at least one million of the images, bounding boxes are also provided. ImageNet contains more than 20,000 categories, with a typical category, such as "balloon" or "strawberry", consisting of several hundred images. The database of annotations of third-party image URLs is freely available directly from ImageNet, though the actual images are not owned by ImageNet. Since 2010, the ImageNet project has run an annual software contest, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where software programs compete to correctly classify and detect objects and scenes. The challenge uses a "trimmed" list of one thousand non-overlapping classes.


70-458: AI researcher Fei-Fei Li began working on the idea for ImageNet in 2006. At a time when most AI research focused on models and algorithms, Li wanted to expand and improve the data available to train AI algorithms. In 2007, Li met with Princeton professor Christiane Fellbaum , one of the creators of WordNet , to discuss the project. As a result of this meeting, Li went on to build ImageNet starting from

140-429: A Bachelor of Arts with a major in physics from Princeton University in 1999. Li completed her senior thesis, titled "Auditory binaural correlogram difference: a new computational model for huggins dichotic pitch," under the supervision of Bradley Dickinson, professor of electrical engineering. During her years at Princeton, she returned home most weekends to help run her family's dry cleaning business and worked as

210-505: A cognate of the word substantive as the basic term for noun (for example, Spanish sustantivo , "noun"). Nouns in the dictionaries of such languages are demarked by the abbreviation s. or sb. instead of n. , which may be used for proper nouns or neuter nouns instead. In English, some modern authors use the word substantive to refer to a class that includes both nouns (single words) and noun phrases (multiword units that are sometimes called noun equivalents ). It can also be used as

280-511: A convolutional neural network (CNN) called AlexNet achieved a top-5 error of 15.3% in the ImageNet 2012 Challenge, more than 10.8 percentage points lower than that of the runner up. Using convolutional neural networks was feasible due to the use of graphics processing units (GPUs) during training, an essential ingredient of the deep learning revolution. According to The Economist , "Suddenly people started to pay attention, not just within

350-581: A dishwasher to supplement the family income. Li then pursued her graduate studies at the California Institute of Technology , where she received a master's degree in electrical engineering in 2001 and a Doctor of Philosophy in electrical engineering in 2005. Li completed her dissertation, titled "Visual Recognition: Computational Models and Human Psychophysics," under the primary supervision of Pietro Perona and secondary supervision of Christof Koch . Her graduate studies were supported by

420-624: A noun is a word that represents a concrete or abstract thing, such as living creatures, places, actions, qualities, states of existence, and ideas. A noun may serve as an object or subject within a phrase, clause, or sentence. In linguistics , nouns constitute a lexical category ( part of speech ) defined according to how its members combine with members of other lexical categories. The syntactic occurrence of nouns differs among languages. In English, prototypical nouns are common nouns or proper nouns that can occur with determiners , articles and attributive adjectives , and can function as

490-544: A or an (in languages that have such articles). Examples of count nouns are chair , nose , and occasion . Mass nouns or uncountable ( non-count ) nouns differ from count nouns in precisely that respect: they cannot take plurals or combine with number words or the above type of quantifiers. For example, the forms a furniture and three furnitures are not used – even though pieces of furniture can be counted. The distinction between mass and count nouns does not primarily concern their corresponding referents but more how

560-445: A person , place , thing , event , substance , quality , quantity , etc., but this manner of definition has been criticized as uninformative. Several English nouns lack an intrinsic referent of their own: behalf (as in on behalf of ), dint ( by dint of ), and sake ( for the sake of ). Moreover, other parts of speech may have reference-like properties: the verbs to rain or to mother , or adjectives like red ; and there

630-826: A "WordNet ID" (wnid), which is a concatenation of part of speech and an "offset" (a unique identifying number). Every wnid starts with "n" because ImageNet only includes nouns. For example, the wnid of synset "dog, domestic dog, Canis familiaris" is "n02084071". The categories in ImageNet fall into 9 levels, from level 1 (such as "mammal") to level 9 (such as "German shepherd"). The images were scraped from online image search engines (Google, Picsearch, MSN, Yahoo, Flickr, etc.) using synonyms in multiple languages. For example: German shepherd, German police dog, German shepherd dog, Alsatian, ovejero alemán, pastore tedesco, 德国牧羊犬. ImageNet consists of images in RGB format with varying resolutions. For example, in the "fish" category of ImageNet 2012,
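The wnid format described above (a part-of-speech letter followed by a numeric offset) can be illustrated with a small sketch. The helper names `parse_wnid` and `make_wnid` are hypothetical, and the fixed width of eight digits is an assumption based on the example "n02084071", not a rule stated in the text.

```python
from typing import Tuple

OFFSET_DIGITS = 8  # assumption: offsets are zero-padded to 8 digits, as in "n02084071"


def parse_wnid(wnid: str) -> Tuple[str, int]:
    """Split a wnid into (part-of-speech letter, numeric offset)."""
    pos, offset = wnid[:1], wnid[1:]
    # ImageNet only includes nouns, so the part-of-speech letter must be "n".
    if pos != "n":
        raise ValueError(f"expected a noun wnid starting with 'n': {wnid!r}")
    if len(offset) != OFFSET_DIGITS or not (offset.isascii() and offset.isdigit()):
        raise ValueError(f"offset is not {OFFSET_DIGITS} digits: {wnid!r}")
    return pos, int(offset)


def make_wnid(pos: str, offset: int) -> str:
    """Concatenate part of speech and zero-padded offset into a wnid."""
    return f"{pos}{offset:0{OFFSET_DIGITS}d}"


pos, offset = parse_wnid("n02084071")  # synset "dog, domestic dog, Canis familiaris"
print(pos, offset)
print(make_wnid(pos, offset))
```

Keeping the offset as an integer while re-padding on output means the round trip reproduces the original string, including its leading zero.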

700-512: A bounding box around the (visible part of the) indicated object. ImageNet uses a variant of the broad WordNet schema to categorize objects, augmented with 120 categories of dog breeds to showcase fine-grained classification. In 2012, ImageNet was the world's largest academic user of Mechanical Turk. The average worker identified 50 images per minute. The original plan was for the full ImageNet to have roughly 50M clean, diverse and full resolution images spread over approximately 50K synsets. This

770-411: A counterpart to attributive when distinguishing between a noun being used as the head (main word) of a noun phrase and a noun being used as a noun adjunct . For example, the noun knee can be said to be used substantively in my knee hurts , but attributively in the patient needed knee replacement . A noun can co-occur with an article or an attributive adjective . Verbs and adjectives cannot. In


840-422: A dense grid of HoG and LBP, sparsified by local coordinate coding and pooling. It achieved 52.9% in classification accuracy and 71.8% in top-5 accuracy. It was trained for 4 days on three 8-core machines (dual quad-core 2GHz Intel Xeon CPU). The second competition in 2011 had fewer teams, with another SVM winning at a top-5 error rate of 25%. The winning team was XRCE (Florent Perronnin and Jorge Sanchez). The system

910-704: A few levers of change. Li has been described as a "researcher bringing humanity to AI." Li was elected as a member of the American Academy of Arts and Sciences in 2021, the National Academy of Engineering in 2020, and the National Academy of Medicine in 2020. Li works on artificial intelligence, machine learning, computer vision, cognitive neuroscience , and computational neuroscience . She has published more than 300 peer-reviewed research papers. Her work appears in computer science and neuroscience journals including Nature , Proceedings of

980-463: A language. Nouns may be classified according to morphological properties such as which prefixes or suffixes they take, and also their relations in syntax – how they combine with other words and expressions of various types. Many such classifications are language-specific, given the obvious differences in syntax and morphology. In English for example, it might be noted that nouns are words that can co-occur with definite articles (as stated at

1050-467: A larger number of categories, and also (unlike the programs) can judge the context of an image. It is estimated that over 6% of labels in the ImageNet-1k validation set are wrong. It is also found that around 10% of ImageNet-1k contains ambiguous or erroneous labels, and that, when presented with a model's prediction and the original ImageNet label, human annotators prefer the prediction of a state of

1120-617: A nonprofit organization working to increase diversity and inclusion in the field of artificial intelligence. Her research expertise includes artificial intelligence , machine learning , deep learning , computer vision and cognitive neuroscience . Li was named in the Time 100 AI Most Influential People list in 2023 and received the Intel Lifetime Achievements Innovation Award in the same year for her contributions to artificial intelligence. She

1190-417: A noun that represents a unique entity ( India , Pegasus , Jupiter , Confucius , Pequod ) – as distinguished from common nouns (or appellative nouns ), which describe a class of entities ( country , animal , planet , person , ship ). In Modern English, most proper nouns – unlike most common nouns – are capitalized regardless of context ( Albania , Newton , Pasteur , America ), as are many of

1260-685: A poster at the 2009 Conference on Computer Vision and Pattern Recognition (CVPR) in Florida, titled "ImageNet: A Preview of a Large-scale Hierarchical Dataset". The poster was reused at the Vision Sciences Society 2009 meeting. In 2009, Alex Berg suggested adding object localization as a task. Li approached the PASCAL Visual Object Classes contest in 2009 for a collaboration. It resulted in the subsequent ImageNet Large Scale Visual Recognition Challenge starting in 2010, which has 1000 classes and object localization, as compared to PASCAL VOC, which had just 20 classes and 19,737 images (in 2010). On 30 September 2012,

1330-586: A singular or a plural verb and referred to by a singular or plural pronoun, the singular being generally preferred when referring to the body as a unit and the plural often being preferred, especially in British English, when emphasizing the individual members. Examples of acceptable and unacceptable use given by Gowers in Plain Words include: Concrete nouns refer to physical entities that can, in principle at least, be observed by at least one of

1400-401: A specific sex. The gender of a pronoun must be appropriate for the item referred to: "The girl said the ring was from her new boyfriend , but he denied it was from him " (three nouns; and three gendered pronouns: or four, if this her is counted as a possessive pronoun ). A proper noun (sometimes called a proper name , though the two terms normally have different meanings) is

1470-470: A subclass of nouns parallel to prototypical nouns ). For example, in the sentence "Gareth thought she was weird", the word she is a pronoun that refers to a person just as the noun Gareth does. The word one can replace parts of noun phrases, and it sometimes stands in for a noun. An example is given below: But one can also stand in for larger parts of a noun phrase. For example, in the following example, one can stand in for new car . Nominalization


1540-501: A trained model. In 2021, ImageNet-1k was updated by annotating faces appearing in the 997 non-person categories. They found that training models on the dataset with these faces blurred caused minimal loss in performance. ImageNetV2 was a new dataset containing three test sets with 10,000 images each, constructed by the same methodology as the original ImageNet. ImageNet-21K-P was a filtered and cleaned subset of ImageNet-21K, with 12,358,688 images from 11,221 categories. The ILSVRC aims to "follow in

1610-497: A turning point. AI’s influence continues to grow, but representation and inclusion of a diversity of researchers in the field does not. It’s critical that we seize this moment to create structures that will support long-term, positive changes. This won’t happen via a single mechanism or quick fix. It starts with early education and extends to the existing structures of power within academia, work cultures among current AI researchers, and gatekeeping functions of research publishing, to name

1680-578: Is a leaf category, meaning that there are no child nodes below it, unlike ImageNet-21K. For example, in ImageNet-21K, there are some images categorized as simply "mammal", whereas in ImageNet-1K, there are only images categorized as things like "German shepherd", since there are no child-words below "German shepherd". In winter 2021, ImageNet-21k was updated. 2,702 categories in the "person" subtree were filtered to prevent "problematic behaviors" in

1750-748: Is a phrase usually headed by a common noun, a proper noun, or a pronoun. The head may be the only constituent, or it may be modified by determiners and adjectives . For example, "The dog sat near Ms Curtis and wagged its tail" contains three NPs: the dog (subject of the verbs sat and wagged ); Ms Curtis (complement of the preposition near ); and its tail (object of wagged ). "You became their teacher" contains two NPs: you (subject of became ); and their teacher . Nouns and noun phrases can typically be replaced by pronouns , such as he, it, she, they, which, these , and those , to avoid repetition or explicit identification, or for other reasons (but as noted earlier, current theory often classifies pronouns as

1820-494: Is a process whereby a word that belongs to another part of speech comes to be used as a noun. This can be a way to create new nouns, or to use other words in ways that resemble nouns. In French and Spanish, for example, adjectives frequently act as nouns referring to people who have the characteristics denoted by the adjective. This sometimes happens in English as well, as in the following examples: For definitions of nouns based on

1890-488: Is deeply against my principles to work on any project that I think is to weaponize AI." In the fall of 2018, Li left Google and returned to Stanford University to continue her professorship. Li is also known for her non-profit work as the co-founder and chairperson of nonprofit organization AI4ALL, whose mission is to educate the next generation of AI technologists, thinkers and leaders by promoting diversity and inclusion through human-centered AI principles. The program

1960-527: Is derived from the Latin term, through the Anglo-Norman nom (other forms include nomme , and noun itself). The word classes were defined partly by the grammatical forms that they take. In Sanskrit, Greek, and Latin, for example, nouns are categorized by gender and inflected for case and number . Because adjectives share these three grammatical categories , adjectives typically were placed in

2030-442: Is little difference between the adverb gleefully and the prepositional phrase with glee . A functional approach defines a noun as a word that can be the head of a nominal phrase, i.e., a phrase with referential function, without needing to go through morphological transformation. Nouns can have a number of different properties and are often sub-categorized based on various of these criteria, depending on their occurrence in

2100-537: Is now known as the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). The ILSVRC uses a "trimmed" list of only 1000 image categories or "classes", including 90 of the 120 dog breeds classified by the full ImageNet schema. The 2010s saw dramatic progress in image processing. The first competition in 2010 had 11 participating teams. The winning entry was a linear support vector machine (SVM). The features are

2170-452: Is referred to as ImageNet-21K. ImageNet-21k contains 14,197,122 images divided into 21,841 classes. Some papers round this up and name it ImageNet-22k. The full ImageNet-21k was released in Fall of 2011, as fall11_whole.tar . There is no official train-validation-test split for ImageNet-21k. Some classes contain only 1-10 samples, while others contain thousands. There are various subsets of


2240-480: Is the ImageNet project, which has revolutionized the field of large-scale visual recognition. Li has led the team of students and collaborators to organize the international competition on ImageNet recognition tasks called ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) between 2010 and 2017 in the academic community. Li's research in computer vision contributed to a line of work called Natural Scene Understanding, or later, story-telling of images. She

2310-491: Is that the categories may be more "elevated" than would be optimal for ImageNet: "Most people are more interested in Lady Gaga or the iPod Mini than in this rare kind of diplodocus." Fei-Fei Li Fei-Fei Li (Chinese: 李飞飞; pinyin: Lǐ Fēifēi; born July 3, 1976) is a Chinese-American computer scientist, known for establishing ImageNet, the dataset that enabled rapid advances in computer vision in

2380-424: Is to offer independent perspectives on emerging trends that intersect science, technology, ethics, governance, and sustainable development . It is designed to act as a central hub for a network of scientific networks, enhancing the integration of scientific insights into UN decision-making processes. Li is married to Stanford professor Silvio Savarese. They have a son and a daughter. Noun In grammar ,

2450-772: The National Science Foundation Graduate Research Fellowship and The Paul & Daisy Soros Fellowships for New Americans . From 2005 to 2006, Li was an assistant professor in the Electrical and Computer Engineering Department at the University of Illinois Urbana-Champaign , and from 2007 to 2009, she was an assistant professor in the Computer Science Department at Princeton University. She joined Stanford in 2009 as an assistant professor, and

2520-419: The head of a noun phrase . According to traditional and popular classification, pronouns are distinct from nouns, but in much modern theory they are considered a subclass of nouns. Every language has various linguistic and grammatical distinctions between nouns and verbs . Word classes (parts of speech) were described by Sanskrit grammarians from at least the 5th century BC. In Yāska 's Nirukta ,

2590-441: The senses ( chair , apple , Janet , atom ), as items supposed to exist in the physical world. Abstract nouns , on the other hand, refer to abstract objects : ideas or concepts ( justice , anger , solubility , duration ). Some nouns have both concrete and abstract meanings: art usually refers to something abstract ("Art is important in human culture"), but it can also refer to a concrete item ("I put my daughter's art up on

2660-443: The sex or social gender of the noun's referent, particularly in the case of nouns denoting people (and sometimes animals), though with exceptions (the feminine French noun personne can refer to a male or a female person). In Modern English, even common nouns like hen and princess and proper nouns like Alicia do not have grammatical gender (their femininity has no relevance in syntax), though they denote persons or animals of

2730-725: The 2010s. She is the Sequoia Capital professor of computer science at Stanford University and former board director at Twitter . Li is a co-director of the Stanford Institute for Human-Centered Artificial Intelligence and a co-director of the Stanford Vision and Learning Lab. She served as the director of the Stanford Artificial Intelligence Laboratory from 2013 to 2018. In 2017, she co-founded AI4ALL,

2800-445: The AI community but across the technology industry as a whole." In 2015, AlexNet was outperformed by Microsoft 's very deep CNN with over 100 layers, which won the ImageNet 2015 contest. ImageNet crowdsources its annotation process. Image-level annotations indicate the presence or absence of an object class in an image, such as "there are tigers in this image" or "there are no tigers in this image". Object-level annotations provide

2870-537: The ImageNet dataset used in various contexts, sometimes referred to as "versions". One of the most highly used subsets of ImageNet is the "ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012–2017 image classification and localization dataset". This is also referred to in the research literature as ImageNet-1K or ILSVRC2017, reflecting the original ILSVRC challenge that involved 1,000 classes. ImageNet-1K contains 1,281,167 training images, 50,000 validation images and 100,000 test images. Each category in ImageNet-1K


2940-410: The National Academy of Sciences , Journal of Neuroscience , Conference on Computer Vision and Pattern Recognition , International Conference on Computer Vision , Conference on Neural Information Processing Systems , European Conference on Computer Vision , International Journal of Computer Vision , and IEEE Transactions on Pattern Analysis and Machine Intelligence . Among her best-known work

3010-508: The Stanford course CS231n on "Deep Learning for Computer Vision," whose 2015 version was previously online at Coursera. She has also taught CS131, an introductory class on computer vision. In May 2020, Li joined the board of directors of Twitter as an independent director. On October 27, 2022, following Elon Musk’s purchase of the company, Li and eight others were removed from Twitter's nine-member board of directors, leaving Musk as

3080-759: The adjectives happy and serene ; circulation from the verb circulate ). Illustrating the wide range of possible classifying principles for nouns, the Awa language of Papua New Guinea regiments nouns according to how ownership is assigned: as alienable possession or inalienable possession. An alienably possessed item (a tree, for example) can exist even without a possessor. But inalienably possessed items are necessarily associated with their possessor and are referred to differently, for example with nouns that function as kin terms (meaning "father", etc.), body-part nouns (meaning "shadow", "hair", etc.), or part–whole nouns (meaning "top", "bottom", etc.). A noun phrase (or NP )

3150-410: The art model in 2020 trained on the original ImageNet, suggesting that ImageNet-1k has been saturated. A study of the history of the multiple layers ( taxonomy , object classes and labeling) of ImageNet and WordNet in 2019 described how bias is deeply embedded in most classification approaches for all sorts of images. ImageNet is working to address various sources of bias. One downside of WordNet use

3220-566: The barrier for entrance to businesses and developers, including the developments of products like AutoML. In September 2017, Google secured a contract from the Department of Defense called Project Maven , which aimed to use AI techniques to interpret images captured by drone cameras. Google told employees who protested the company's work on Project Maven that their role was "specifically scoped to be for non-offensive purposes." In June 2018, Google told employees it would not seek renewal of

3290-557: The contract. In internal emails which were later leaked to reporters, Li expressed enthusiasm for the Google Cloud role in Project Maven, but warned against mentioning its AI component, saying that military AI is linked in the public mind with the danger of autonomous weapons . Asked about those leaked emails, Li told The New York Times , "I believe in human-centered AI to benefit people in positive and benevolent ways. It

3360-403: The dataset is expected to be smaller. The applications of progress in this area would range from robotic navigation to augmented reality . By 2015, researchers at Microsoft reported that their CNNs exceeded human ability at the narrow ILSVRC tasks. However, as one of the challenge's organizers, Olga Russakovsky , pointed out in 2015, the contest is over only 1000 categories; humans can recognize

3430-401: The definite article is le for masculine nouns and la for feminine; adjectives and certain verb forms also change (sometimes with the simple addition of -e for feminine). Grammatical gender often correlates with the form of the noun and the inflection pattern it follows; for example, in both Italian and Romanian most nouns ending in -a are feminine. Gender can also correlate with

3500-525: The following, an asterisk (*) in front of an example means that this example is ungrammatical. Nouns have sometimes been characterized in terms of the grammatical categories by which they may be varied (for example gender , case , and number ). Such definitions tend to be language-specific, since different languages may apply different categories. Nouns are frequently defined, particularly in informal contexts, in terms of their semantic properties (their meanings). Nouns are described as words that refer to

3570-501: The footsteps" of the smaller-scale PASCAL VOC challenge, established in 2005, which contained only about 20,000 images and twenty object classes. To "democratize" ImageNet, Fei-Fei Li proposed to the PASCAL VOC team a collaboration, beginning in 2010, where research teams would evaluate their algorithms on the given data set, and compete to achieve higher accuracy on several visual recognition tasks. The resulting annual competition


3640-474: The forms that are derived from them (the common noun in "he's an Albanian "; the adjectival forms in "he's of Albanian heritage" and " Newtonian physics", but not in " pasteurized milk"; the second verb in "they sought to Americanize us"). Count nouns or countable nouns are common nouns that can take a plural , can combine with numerals or counting quantifiers (e.g., one , two , several , every , most ), and can take an indefinite article such as

3710-482: The fridge"). A noun might have a literal (concrete) and also a figurative (abstract) meaning: "a brass key " and "the key to success"; "a block in the pipe" and "a mental block ". Similarly, some abstract nouns have developed etymologically by figurative extension from literal roots ( drawback , fraction , holdout , uptake ). Many abstract nouns in English are formed by adding a suffix ( -ness , -ity , -ion ) to adjectives or verbs ( happiness and serenity from

3780-414: The labeling. They had enough budget to have each of the 14 million images labelled three times. The original plan called for 10,000 images per category, for 40,000 categories at 400 million images, each verified 3 times. They found that humans can classify at most 2 images/sec. At this rate, it was estimated to take 19 human-years of labor (without rest). They presented their database for the first time as
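The 19 human-year figure above can be reproduced with simple arithmetic. The sketch below is one reading of the numbers in the text, under the assumption that each of the three verifications is a separate judgment made at the stated maximum rate of 2 images per second; the function name `labeling_years` is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # labor "without rest", so a full calendar year


def labeling_years(images: int, verifications_per_image: int,
                   images_per_second: float) -> float:
    """Estimate continuous human-years needed to label a dataset."""
    total_judgments = images * verifications_per_image
    return total_judgments / images_per_second / SECONDS_PER_YEAR


# Original plan: 10,000 images per category for 40,000 categories.
planned_images = 10_000 * 40_000  # 400 million images
years = labeling_years(planned_images, verifications_per_image=3,
                       images_per_second=2.0)
print(round(years))  # 19
```

Under these assumptions the total is 1.2 billion judgments, or 600 million seconds of work, which matches the estimate quoted in the text.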

3850-571: The mean and standard deviations for ImageNet, so this whitens the input data. Each image is labelled with exactly one wnid. Dense SIFT features (raw SIFT descriptors, quantized codewords, and coordinates of each descriptor/codeword) for ImageNet-1K were available for download, designed for bag of visual words. The bounding boxes of objects were available for about 3000 popular synsets with on average 150 images in each synset. Furthermore, some images have attributes. They released 25 attributes for ~400 popular synsets: The full original dataset

3920-596: The noun ( nāma ) is one of the four main categories of words defined. The Ancient Greek equivalent was ónoma (ὄνομα), referred to by Plato in the Cratylus dialog , and later listed as one of the eight parts of speech in The Art of Grammar , attributed to Dionysius Thrax (2nd century BC). The term used in Latin grammar was nōmen . All of these terms for "noun" were also words meaning "name". The English word noun

3990-467: The nouns present those entities. Many nouns have both countable and uncountable uses; for example, soda is countable in "give me three sodas", but uncountable in "he likes soda". Collective nouns are nouns that – even when they are treated in their morphology and syntax as singular – refer to groups consisting of more than one individual or entity. Examples include committee , government , and police . In English these nouns may be followed by

4060-482: The resolution ranges from 4288 x 2848 to 75 x 56. In machine learning, these are typically preprocessed into a standard constant resolution, and whitened, before further processing by neural networks. For example, in PyTorch, ImageNet images are by default normalized by dividing the pixel values so that they fall between 0 and 1, then subtracting [0.485, 0.456, 0.406], then dividing by [0.229, 0.224, 0.225]. These are
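The normalization steps just described can be sketched for a single RGB pixel in plain Python. This is an illustration of the arithmetic only, not PyTorch's own code; it assumes 8-bit pixel values (0 to 255), and the names `normalize_pixel`, `IMAGENET_MEAN` and `IMAGENET_STD` are hypothetical.

```python
from typing import Sequence, Tuple

# Per-channel (R, G, B) constants quoted in the text.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb: Sequence[int]) -> Tuple[float, ...]:
    """Scale an 8-bit RGB pixel to [0, 1], then subtract the mean and divide by the std."""
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


print(normalize_pixel((0, 0, 0)))        # darkest pixel: all channels negative
print(normalize_pixel((255, 255, 255)))  # brightest pixel: all channels positive
```

A pixel whose scaled value equals the channel mean maps to zero, so after this step each channel is roughly centered with unit spread over the dataset.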

4130-564: The roughly 22,000 nouns of WordNet and using many of its features. She was also inspired by a 1987 estimate that the average person recognizes roughly 30,000 different kinds of objects. As an assistant professor at Princeton , Li assembled a team of researchers to work on the ImageNet project. They used Amazon Mechanical Turk to help with the classification of images. Labeling started in July 2008 and ended in April 2010. It took 2.5 years to complete

4200-458: The same class as nouns. Similarly, the Latin term nōmen includes both nouns (substantives) and adjectives, as originally did the English word noun , the two types being distinguished as nouns substantive and nouns adjective (or substantive nouns and adjective nouns , or simply substantives and adjectives ). (The word nominal is now sometimes used to denote a class that includes both nouns and adjectives.) Many European languages use

4270-675: The sole director. On 3 August 2023, Li was announced as a member of the United Nations (UN) Scientific Advisory Board, established by Secretary-General António Guterres. She is among seven external scientists on this board, which also includes the Chief Scientists from various UN agencies, the UN University Rector, and the Secretary-General’s Envoy on Technology. The board's primary aim


4340-474: The start of an industry-wide artificial intelligence boom. By 2014, more than fifty institutions participated in the ILSVRC. In 2017, 29 of 38 competing teams had greater than 95% accuracy. In 2017 ImageNet stated it would roll out a new, much more difficult challenge in 2018 that involves classifying 3D objects using natural language. Because creating 3D data is more costly than annotating a pre-existing 2D image,

4410-461: The start of this article), but this could not apply in Russian , which has no definite articles. In some languages common and proper nouns have grammatical gender, typically masculine, feminine, and neuter. The gender of a noun (as well as its number and case, where applicable) will often require agreement in words that modify or are used along with it. In French for example, the singular form of

4480-399: Was another linear SVM, running on quantized Fisher vectors . It achieved 74.2% in top-5 accuracy. In 2012, a deep convolutional neural net called AlexNet achieved 84.7% in top-5 accuracy, a great leap forward. In the next couple of years, top-5 accuracy grew to above 90%. While the 2012 breakthrough "combined pieces that were all there before", the dramatic quantitative improvement marked
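Top-5 accuracy, the metric quoted in these results, counts a prediction as correct when the true class is among the model's five highest-scoring classes; top-5 error is one minus that value, so 84.7% top-5 accuracy corresponds to 15.3% top-5 error. A minimal pure-Python sketch follows, with hypothetical function names and toy scores that do not come from any ILSVRC entry.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int = 5) -> float:
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example with 6 classes and 2 images.
toy_scores = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],  # class 0 has the lowest score
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],  # class 5 has the highest score
]
toy_labels = [0, 5]

accuracy = top_k_accuracy(toy_scores, toy_labels, k=5)
print(accuracy, 1.0 - accuracy)  # top-5 accuracy and top-5 error
```

In the toy data the first image's true class ranks sixth of six, so it is a top-5 miss, while the second image is a hit.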

4550-653: Was born in Beijing, China in 1976 and grew up in Chengdu, Sichuan. She studied at Sichuan Chengdu No.7 High School. When she was 12, her father emigrated to Parsippany, New Jersey from China. When she was 16, she and her mother joined him in the United States. She graduated from Parsippany High School in 1995. She was inducted into the hall of fame at Parsippany High School in 2017. Li pursued her undergraduate studies at Princeton University. She received

4620-873: Was created in collaboration with Melinda French Gates and Jensen Huang. Prior to establishing AI4ALL in 2017, Li and her former student Olga Russakovsky, currently an assistant professor at Princeton University, co-founded and co-directed the precursor program at Stanford called SAILORS (Stanford AI Lab OutReach Summers). SAILORS was an annual summer camp at Stanford dedicated to AI education and research for 9th-grade girls, established in 2015 and renamed AI4ALL @Stanford in 2017. In 2018, AI4ALL launched five more summer programs in addition to Stanford's, at Princeton University, Carnegie Mellon University, Boston University, University of California Berkeley, and Canada's Simon Fraser University. We are at

4690-700: Was elected as a member of the National Academy of Engineering and the National Academy of Medicine in 2020, and the American Academy of Arts and Sciences in 2021. On August 3, 2023, it was announced that Li was appointed to the United Nations Scientific Advisory Board, established by Secretary-General Antonio Guterres. In 2024, Li was named to the Gold House’s most impactful Asian A100 list. Li

4760-581: Was not achieved. The summary statistics given on April 30, 2010: The categories of ImageNet were filtered from the WordNet concepts. Because each concept can contain multiple synonyms (for example, "kitty" and "young cat"), each concept is called a "synonym set" or "synset". There were more than 100,000 synsets in WordNet 3.0, the majority of them nouns (80,000+). The ImageNet dataset filtered these to 21,841 synsets that are countable nouns that can be visually illustrated. Each synset in WordNet 3.0 has

4830-709: Was promoted to associate professor with tenure in 2012, and then full professor in 2018. At Stanford, Li served as the director of Stanford Artificial Intelligence Lab (SAIL) from 2013 to 2018. She became the founding co-director of Stanford's University-level initiative - the Human-Centered AI Institute, along with co-director Dr. John Etchemendy , former provost of Stanford University. On her sabbatical from Stanford University from January 2017 to fall of 2018, Li joined Google Cloud as its Chief Scientist of AI/ML and Vice President. At Google, her team focused on democratizing AI technology and lowering

4900-714: Was recognized for her work in this area by the International Association for Pattern Recognition in 2016. She delivered a talk on the main stage of TED in Vancouver in 2015, which has since been viewed more than 2 million times. In recent years, Li's research has expanded to artificial intelligence in healthcare, in close collaboration with Stanford University School of Medicine professor Arnold Milstein. She has also worked on reducing bias in image recognition, for instance by removing concepts with low imageability from ImageNet. She teaches
