The natural language processing (NLP) community has recently seen a growth in corpus-based methods. Algorithms light in linguistic theories but rich in available training data have been successfully applied to several applications such as machine translation (Och and Ney 2002), information extraction (Etzioni et al. 2004), and question answering (Brill et al. 2001).

In the last decade, we have seen an explosion in the amount of available digital text resources. It is estimated that the Internet contains hundreds of terabytes of text data, most of which is in an unstructured format. Yet, many NLP algorithms tap into only megabytes or gigabytes of this information.

In this paper, we make a step towards acquiring semantic knowledge from terabytes of data. We present an algorithm for extracting is-a relations, designed for the terascale, and compare it to a state-of-the-art method that employs deep analysis of text (Pantel and Ravichandran 2004). We show that by simply utilizing more data on this task, we can achieve performance similar to that of a linguistically rich approach. The current state-of-the-art co-occurrence model requires an estimated 10 years just to parse a 1TB corpus (see Table 1). Instead of using a syntactically motivated co-occurrence approach as above, our system uses lexico-syntactic rules. In particular, it finds lexico-POS patterns by making modifications to the basic edit distance algorithm, as sketched below. Once these patterns have been learnt, the algorithm for finding new is-a relations runs in O(n), where n is the number of sentences.
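To make that pattern-learning step concrete, the sketch below generalizes two POS-tagged sentences containing the same anchor is-a pair into a single lexico-POS pattern, then matches the learned pattern against a new sentence. This is a minimal illustration, not the authors' implementation: Python's standard-library SequenceMatcher stands in for the modified edit distance algorithm, and the slot markers _X_/_Y_ and all function names are assumptions made for the example.

```python
from difflib import SequenceMatcher

# Slot markers for the anchor hyponym/hypernym pair (an illustrative convention).
X, Y = "_X_", "_Y_"

def generalize(sent_a, sent_b):
    """Merge two POS-tagged sentences (lists of (token, POS) pairs) into one
    lexico-POS pattern. Aligned identical tokens are kept verbatim; substituted
    tokens that share a POS tag generalize to that tag; any other mismatch
    collapses to a '*' wildcard. SequenceMatcher stands in here for the paper's
    modified edit-distance alignment."""
    matcher = SequenceMatcher(a=[t for t, _ in sent_a], b=[t for t, _ in sent_b])
    pattern = []
    for op, a0, a1, b0, b1 in matcher.get_opcodes():
        if op == "equal":
            pattern.extend(tok for tok, _ in sent_a[a0:a1])
        elif op == "replace" and (a1 - a0) == (b1 - b0):
            for (_, pa), (_, pb) in zip(sent_a[a0:a1], sent_b[b0:b1]):
                pattern.append(pa if pa == pb else "*")
        else:
            pattern.append("*")  # unaligned insertions/deletions
    return pattern

def match(pattern, sent):
    """Scan one POS-tagged sentence for the pattern; on success return the
    captured (X, Y) is-a pair. In this simplified matcher, '*' and POS tags
    each match exactly one token."""
    tokens = [t for t, _ in sent]
    tags = [p for _, p in sent]
    n, m = len(tokens), len(pattern)
    for i in range(n - m + 1):
        slots = {}
        for j, p in enumerate(pattern):
            tok, tag = tokens[i + j], tags[i + j]
            if p in (X, Y):
                slots[p] = tok
            elif p != "*" and p != tok and p != tag:
                break
        else:
            return slots.get(X), slots.get(Y)
    return None

# Two training sentences sharing the anchor pair (azalea, flower),
# with the anchors already replaced by slot markers:
s1 = [("the", "DT"), (X, "NN"), ("is", "VBZ"), ("a", "DT"), (Y, "NN")]
s2 = [("an", "DT"), (X, "NN"), ("is", "VBZ"), ("a", "DT"), (Y, "NN")]
pattern = generalize(s1, s2)  # ['DT', '_X_', 'is', 'a', '_Y_']
new = [("every", "DT"), ("tiramisu", "NN"), ("is", "VBZ"), ("a", "DT"), ("dessert", "NN")]
print(pattern, match(pattern, new))  # -> ('tiramisu', 'dessert')
```

Because each sentence is checked exactly once against a fixed pattern set, applying the learned patterns to a corpus scales linearly with the number of sentences, consistent with the O(n) behaviour claimed above.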
In semantic hierarchies such as WordNet (Miller 1990), an is-a relation between two words X and Y represents a subordinate relationship (i.e., X is more specific than Y). Many algorithms have recently been proposed to automatically mine is-a (hyponym/hypernym) relations between words. Here, we focus on is-a relations that are characterized by the questions "What/Who is X?" For example, Table 2 shows a sample of 10 is-a relations discovered by the algorithms presented in this paper. In this table, we call azalea, tiramisu, and Winona Ryder instances of the respective concepts flower, dessert, and actress. These kinds of is-a relations would be useful for various purposes such as ontology construction, semantic information retrieval, and question answering.

Previous approaches to extracting is-a relations fall under two categories: pattern-based and co-occurrence-based approaches. Recently, Pantel and Ravichandran (2004) extended the co-occurrence-based approach by making use of all syntactic dependency features for each noun.

The main contribution of this paper is a comparison of the quality of our pattern-based and co-occurrence models as a function of processing time and corpus size; the focus is on the precision and recall of the systems as a function of the corpus size. We will show that, for very small or very large corpora, or for situations where recall is valued over precision, the pattern-based approach is best. The paper also lays a foundation for terascale acquisition of knowledge.

There is a long-standing need for higher-quality performance in NLP systems. Our biggest challenge as we venture to the terascale is to use our newfound wealth not only to build better systems, but to improve our understanding of language. Finally, there is promise for increasing our system's accuracy by re-ranking the outputs of the top-5 hypernyms: the performance of the system in the top-5 category is already much better than that of WordNet (38%). One possible re-scoring is sketched below.
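As a closing illustration of that re-ranking idea, the toy sketch below re-scores a top-5 hypernym list by its corpus support. The frequency-based score, the function name, and the example counts are all hypothetical: the paper only notes that re-ranking is promising, not how it should be done.

```python
from collections import Counter

def rerank_top5(candidates, support):
    """Re-rank a system's top-5 hypernym candidates for a term.

    `candidates` is the original ranked top-5 list; `support` maps each
    candidate to the number of distinct pattern matches that proposed it.
    Frequency-based re-scoring is an assumed, illustrative choice."""
    return sorted(candidates, key=lambda h: support.get(h, 0), reverse=True)

top5 = ["plant", "flower", "shrub", "color", "tree"]
support = Counter({"flower": 42, "plant": 17, "shrub": 9, "tree": 4, "color": 1})
print(rerank_top5(top5, support))  # ['flower', 'plant', 'shrub', 'tree', 'color']
```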