SetFit with sentence-transformers/paraphrase-MiniLM-L3-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-MiniLM-L3-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
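
To make step 1 more concrete, the following is a minimal sketch of how labeled samples could be turned into contrastive pairs. It is an illustration only, not SetFit's own sampler: the helper name `make_contrastive_pairs` is hypothetical, and the 1.0/0.0 targets assume a cosine-similarity style objective in which same-label pairs are pulled together and different-label pairs are pushed apart. The three sample texts are taken from the label examples in this card.

```python
from itertools import combinations


def make_contrastive_pairs(samples):
    """Turn (text, label) samples into (text_a, text_b, target) pairs.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; not part of the SetFit API.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(samples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Three samples in the "column name: values" style used by this card.
samples = [
    ("shift: PM, AM", "AM/PM"),
    ("color: Yellow, Black, White", "Color"),
    ("primary_fur_color: None, Gray, Cinnamon, Black", "Color"),
]

pairs = make_contrastive_pairs(samples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(positives), len(negatives))  # prints: 1 2
```

In step 2, the fine-tuned encoder would embed each training text and those embeddings would serve as the features for the classification head.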

Model Details

Model Description

Model Sources

Model Labels

Label Examples
Month Number
  • 'Incident.Date.Month: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12'
  • 'bibliography.publication.month: 6, 11, 3, 8, 1, 10, 7, 2, 4, 5, 9, 12'
  • 'mp_month: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12'
Date
  • 'end_date: 12/20/22, 12/19/22, 12/15/22, 12/14/22, 12/13/22, 12/12/22, 12/11/22, 12/7/22, 12/6/22, 12/5/22, 12/4/22, 12/2/22, 11/29/22, 11/22/22, 11/21/22, 11/20/22, 11/19/22, 11/17/22, 11/15/22, 11/14/22'
  • 'STOP_FRISK_DATE: 1/16/2017, 2/8/2017, 2/20/2017, 2/21/2017, 2/17/2017, 2/25/2017, 3/3/2017, 3/16/2017, 3/31/2017, 4/2/2017, 4/4/2017, 3/24/2017, 4/6/2017, 4/18/2017, 5/6/2017, 5/10/2017, 5/17/2017, 5/7/2017, 5/24/2017, 6/8/2017'
  • 'start_date: 10/31/20, 10/30/20, 10/29/20, 10/27/20, 10/23/20, 10/28/20, 10/26/20, 10/24/20, 10/8/20, 10/22/20, 10/19/20, 10/21/20, 10/11/20, 10/20/20, 10/16/20, 10/18/20, 10/15/20, 10/12/20, 10/13/20, 10/14/20'
Categorical
  • 'STOP_LOCATION_PATROL_BORO_NAME: PBMS, (nul, 986, 5 AV, PBMN, PBBX, 238, 233, 1011, 237, PBBS, 991, 154, PBBN, PBQN, PBQS, 183, 1022, 1025, 220'
  • 'Subregion: Western Europe, Italy, Greece, Turkey, Western Asia, Africa (northeastern) and Red Sea, Africa (eastern), Africa (central), Africa (western), Africa (northern), Middle East (western), Middle East (southern), Middle East (eastern), Indian Ocean (western), Indian Ocean (southern), New Zealand, Kermadec Islands, Tonga Islands, Samoan and Wallis Islands, Fiji Islands'
  • 'Procedure.Heart Attack.Quality: Average, Unknown, Worse, Better'
Year
  • 'Year: 1998, 1997, 1996, 1995, 1993, 1994'
  • 'cycle: 2020'
  • 'metadata.acquisition date: 2009, 1924, 2010, 1968, 1982, 1997, 2012, 1983, 1996, 1900, 1990, 1995, 1931, 1960, 1966, 1955, 1993, 1979, 2001, 2011'
Longitude
  • 'Longitude: 2,77228900, 2,77461100, 2,77370600, 2,77423900, 2,77654400, 2,79937600, 2,78064700, 2,77697400, 2,78928200, 2,78032200, 2,77731200, 2,77121300, 2,77167600, 2,78236500, 2,76694300, 2,77139500, 2,76872200, 2,76741500, 2,77156700, 2,82065100'
  • 'long: 40.65531753386127, 35.52146509142811, 41.04610174058556, 37.25718863973695, 37.73038191275334, 38.78755702518432, 36.31538469187874, 38.3542649521305, 40.33741738725765, 36.831052736369664, 37.39711396680899, 38.28297641253209, 40.25037415629944, 39.12501528359793, 40.179108531876246, 38.165405118101205, 40.28234452941448, 37.1590112746327, 40.08056518798263, 38.45329795732872'
  • 'Longitude: 6.85, 2.97, 2.53, -4.02, 10.87, 11.93, 12.7, 14.139, 14.426, 13.897, 14.83, 15.213, 15.064, 14.933, 14.962, 14.999, 12.02, 14.399, 23.336, 24.439'
Floating Point Number
  • 'ins_premium: 784.55, 1053.48, 899.47, 827.34, 878.41, 835.5, 1068.73, 1137.87, 1273.89, 1160.13, 913.15, 861.18, 641.96, 803.11, 710.46, 649.06, 780.45, 872.51, 1281.55, 661.88'
  • 'Data.Vitamins.Vitamin B12: 0.05, 0.56, 0.54, 0.36, 0.61, 0.38, 0.55, 0.58, 0.22, 0.37, 0.46, 0.3, 0.07, 0.5, 0.35, 0.16, 0.24, 0.44, 0.29, 0.85'
  • 'Average Wage Appx MOE: 103382.66673939777, 116573.18275172487, 108787.21048470394, 118363.44825256945, 100311.98088082067, 27560.471912039546, 27835.56534877041, 27020.999829170632, 100720.90656498625, 26279.88466481348, 26033.491463487928, 25918.56067562003, 25523.8518942556, 68486.51231657575, 98614.90773401306, 63965.94799270596, 59979.385203569575, 67171.5207463371, 84156.83712285856, 76854.95603440655'
Slug
  • 'Slug Geography: united-states, iowa, michigan, minnesota, north-dakota, south-dakota, wisconsin, minneapolis-st-paul-bloomington-mn-wi'
  • 'Slug Detailed Occupation: physicians, physicians-surgeons, lawyers-judges-magistrates-other-judicial-workers, medical-health-services-managers, chief-executives-legislators, veterinarians, social-community-service-managers, securities-commodities-financial-services-sales-agents, petroleum-mining-geological-engineers-including-mining-safety-engineers, economists, miscellaneous-social-scientists-including-survey-researchers-sociologists, natural-sciences-managers, geoscientists-and-hydrologists-except-geographers, detectives-criminal-investigators, judicial-law-clerks, other-psychologists, architectural-engineering-managers, education-administrators, astronomers-physicists, public-relations-and-fundraising-managers'
  • 'Slug Geography: united-states, arizona, california, nevada, oregon, los-angeles-long-beach-anaheim-ca, riverside-san-bernardino-ontario-ca, san-diego-carlsbad-ca, san-francisco-oakland-hayward-ca'
U.S. State Abbreviation
  • 'State: AK, AL, AR, AZ, CA, CO, CT, DC, DE, FL, GA, HI, IA, ID, IL, IN, KS, KY, LA, MA'
  • 'abbrev: AL, AK, AZ, AR, CA, CO, CT, DE, DC, FL, GA, HI, ID, IL, IN, IA, KS, KY, LA, ME'
  • 'Facility.State: AL, AK, AZ, AR, CA, CO, CT, DE, DC, FL, GA, HI, ID, IL, IN, IA, KS, KY, LA, ME'
Month Name
  • 'Month: JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC'
  • 'MONTH2: January, February, March, April, May, June, July, August, September, October, November, December'
Day of Month
  • 'bibliography.publication.day: 1, 17, 16, 20, 29, 10, 14, 11, 9, 18, 19, 22, 25, 15, 6, 28, 27, 2, 12, 21'
  • 'Date.Day: 26, 24, 31, 7, 14, 21, 28, 5, 12, 19, 2, 9, 16, 23, 30, 4, 11, 18, 25, 1'
Currency Code
  • 'cur_name: AFN, DZD, AOA, ARS, AMD, AZN, BDT, INR, BYR, XOF, BTN, BOB, BIF, KHR, XAF, CVE, CNY, COP, USD, CDF'
Last Name
  • 'answer: Spanberger, Freitas, Eastman, Bacon, Schaeffer, Schupp, Wagner, Schulte, Balter, Katko, Williams, Hale, Spartz, Tucker, Elliott, Hill, Golden, Crafts, Newman, Fricilone'
  • 'candidat: Bush, Perot, Clinton'
Timestamp
  • 'Modification: 26/06/2022 13:31:22, 12/04/2018 15:31:20, 26/06/2022 13:30:09, 26/06/2022 13:30:02, 26/06/2022 13:30:31, 26/06/2022 11:27:12, 26/06/2022 13:30:39, 28/10/2018 00:10:20, 12/04/2018 15:31:19, 26/06/2022 11:26:39, 12/07/2022 09:46:24, 12/04/2018 15:31:18, 21/10/2022 13:07:41, 21/10/2022 13:07:50, 16/09/2020 10:36:33, 26/06/2022 15:36:44, 24/07/2022 09:14:31, 12/04/2018 15:31:17, 26/06/2022 15:36:38, 12/07/2022 09:45:04'
  • 'created_at: 12/21/22 09:28, 12/21/22 12:52, 12/16/22 18:27, 12/16/22 21:10, 12/14/22 10:39, 12/14/22 08:22, 12/15/22 18:31, 12/14/22 14:13, 12/13/22 09:36, 12/14/22 08:23, 12/14/22 15:40, 12/15/22 09:40, 12/7/22 10:47, 12/7/22 08:17, 12/7/22 17:56, 12/15/22 09:50, 11/30/22 09:25, 11/23/22 08:46, 12/1/22 09:39, 12/5/22 08:29'
  • 'created_at: 12/30/20 12:29, 11/2/20 21:26, 11/2/20 22:16, 11/2/20 21:32, 11/2/20 22:01, 11/2/20 22:18, 11/2/20 22:26, 11/2/20 23:31, 11/2/20 21:49, 10/31/20 17:22, 11/1/20 14:39, 11/2/20 08:22, 10/29/20 14:16, 10/31/20 08:36, 10/29/20 11:08, 10/29/20 09:00, 10/29/20 16:13, 10/29/20 16:14, 10/30/20 15:45, 10/28/20 09:24'
Day of Week
  • 'DAY2: Monday, Wednesday, Tuesday, Friday, Saturday, Thursday, Sunday'
  • 'day: Sun, Sat, Thur, Fri'
Integer
  • 'RF: 91.0, 92.0, 88.0, nan, 87.0, 81.0, 84.0, 70.0, 85.0, 83.0, 55.0, 86.0, 66.0, 62.0, 69.0, 82.0, 76.0, 77.0, 79.0, 68.0'
  • 'GK diving: 7.0, 6.0, 9.0, 27.0, 91.0, 15.0, 90.0, 11.0, 10.0, 5.0, 85.0, 13.0, 3.0, 89.0, 84.0, 14.0, 2.0, 88.0, 12.0, 4.0'
  • 'Household Income by Race Moe: 128.0, 286.0, 270.0, 445.0, 390.0, 315.0, 496.0, 734.0, 791.0, 135.0, 266.0, 231.99999999999997, 409.0, 304.0, 326.0, 488.0, 723.0, 531.0, 140.0, 275.0'
Street Address
  • 'STOP_LOCATION_FULL_ADDRESS: 180 GREENWICH STREET, WALL STREET && BROADWAY, 75 GREENE STREET, 429 WEST BROADWAY, WEST STREET && CHAMBERS STREET, CHAMBERS STREET && WEST BROADWAY, CORTLANDT STREET && CHURCH STREET, 111 FULTON STREET, 25 CLIFF STREET, SPRING STREET && AVENUE OF THE AMERICAS, 130 CEDAR STREET, 225 LIBERTY STREET, BARCLAY STREET && WEST STREET, 153 GREENWICH STREET, BATTERY PLACE && STATE STREET, MERCER STREET && BROOME STREET, WEST STREET && CANAL STREET, BROADWAY && PRINCE STREET, WEST BROADWAY && AVENUE OF THE AMERICAS, 3 SOUTH STREET'
URL
U.S. State
  • 'Geography: Arizona, California, Nevada, Oregon'
  • 'State: Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, District of Columbia, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine'
  • 'Slug Geography: california'
Zip Code
  • 'recipient_zip: 995084442, 99503, 995163436, 352124572, 35216, 35976, 358021277, 352174710, 35203, 35233, 35805, 72716, 72201, 72035, 72015, 72223, 72019, 72113, 72758, 72227'
  • 'STOP_LOCATION_ZIP_CODE: (null), 20292, AVENUE, 5 AVEN, 10019, 22768, 10035, 10026, 10128, 24231, 10030, 10039, 23874, 11213, 11233, 100652, 10451, 23543, 100745, PROSPE'
  • 'zip_codes: nan, 12081.0, 10090.0, 12423.0, 12420.0'
Country Name
  • 'Nationality: Portugal, Argentina, Brazil, Uruguay, Germany, Poland, Spain, Belgium, Chile, Croatia, Wales, Italy, Slovenia, France, Gabon, Sweden, Netherlands, Denmark, Slovakia, England'
  • 'adm0_name: Afghanistan, Algeria, Angola, Argentina, Armenia, Azerbaijan, Bangladesh, Bassas da India, Belarus, Benin, Bhutan, Bolivia, Burkina Faso, Burundi, Cambodia, Cameroon, Cape Verde, Central African Republic, Chad, China'
  • 'Geography: United States'
Boolean
  • 'ranked_choice_reallocated: False, True'
  • 'wealth.how.was founder: True'
  • 'internal: False'
Short text
  • 'agestr: Under 5, 5 to 9, 10 to 14, 15 to 19, 20 to 24, 25 to 29, 30 to 34, 35 to 39, 40 to 44, 45 to 49, 50 to 54, 55 to 59, 60 to 64, 65 to 69, 70 to 74, 75 to 79, 80 to 84'
  • "Show.Name: Tru, Miss Saigon, A Streetcar Named Desire 92, The Sisters Rosensweig, Beauty And The Beast, A Little More Magic, Broken Glass, Show Boat, Sunset Boulevard, The Shadow Box, Uncle Vanya, Smokey Joe'S Cafe, Having Our Say, Hamlet 95, The Rose Tattoo, A Month In The Country, Arcadia, Cats, Chronicle Of A Death Foretold, How To Succeed In Business Without Really Trying"
  • 'Club: Real Madrid CF, FC Barcelona, Paris Saint-Germain, FC Bayern Munich, Manchester United, Chelsea, Juventus, Manchester City, Arsenal, Atlético Madrid, Borussia Dortmund, Milan, Tottenham Hotspur, Napoli, Inter, Liverpool, Roma, Beşiktaş JK, AS Monaco, Bayer 04 Leverkusen'
Occupation
  • 'Detailed Occupation: Other managers, Cashiers, Retail salespersons, Driver/sales workers & truck drivers, Registered nurses'
  • 'Detailed Occupation: Physicians, Physicians & surgeons, Lawyers, & judges, magistrates, & other judicial workers, Medical & health services managers, Chief executives & legislators, Veterinarians, Social & community service managers, Securities, commodities, & financial services sales agents, Petroleum, mining & geological engineers, including mining safety engineers, Economists, Miscellaneous social scientists, including survey researchers & sociologists, Natural sciences managers, Geoscientists and hydrologists, except geographers, Detectives & criminal investigators, Judicial law clerks, Other psychologists, Architectural & engineering managers, Education administrators, Astronomers & physicists, Public relations and fundraising managers'
  • 'occupation: Operatives, Craftsmen, Sales, Other, Managers/admin, Professional/technical, Clerical/unskilled, Laborers, Transport, Service, nan, Household workers, Farm laborers, Farmers'
Partial timestamp
  • 'Last Known Eruption: 8300 BCE, 4040 BCE, Unknown, 3600 BCE, 1282 CE, 104 BCE, 1538 CE, 1944 CE, 1302 CE, 8040 BCE, 2019 CE, 1230 CE, 1890 CE, 1867 CE, 1891 CE, 1050 BCE, 258 BCE, 140 CE, 1950 CE, 1888 CE'
  • 'bibliography.publication.full: June, 1998, November, 1999, March, 1994, June 17, 2008, August 16, 2005, August 20, 2006, August 29, 2006, January 10, 2006, March, 2001, June, 2001, October 14, 1892, July, 1998, July, 2003, January, 1994, October 1997, August 16, 2013, February 11, 2006, June 9, 2008, January 1, 1870, April, 2001'
  • 'Rating.Experience: Below, Same, None, Above'
Street Name
  • 'STOP_LOCATION_STREET_NAME: GREENWICH STREET, WALL STREET, GREENE STREET, WEST BROADWAY, WEST STREET, CHAMBERS STREET, CORTLANDT STREET, FULTON STREET, CLIFF STREET, SPRING STREET, CEDAR STREET, LIBERTY STREET, BARCLAY STREET, BATTERY PLACE, MERCER STREET, BROADWAY, SOUTH STREET, THOMPSON STREET, JAY STREET, CHURCH STREET'
Full Name
  • "cand_nm: Rubio, Marco, Santorum, Richard J., Perry, James R. (Rick), Carson, Benjamin S., Cruz, Rafael Edward 'Ted', Paul, Rand, Clinton, Hillary Rodham, Sanders, Bernard, Fiorina, Carly, Huckabee, Mike, Pataki, George E., O'Malley, Martin Joseph, Graham, Lindsey O., Bush, Jeb, Trump, Donald J., Jindal, Bobby, Christie, Christopher J., Walker, Scott, Stein, Jill, Webb, James Henry Jr."
  • 'sponsor_candidate: None, Vern Buchanan, Joyce Ann Elliott, Xochitl Torres Small, Desiree Tims, Morris Durham Davis, John Katko, Stephen Daniel, Nancy Mace, Alaina Shearer, Wesley Hunt, Scott Perry, J.D. Scholten, Jim Bognet, Angie Craig, Brynne S. Kennedy, Young Kim, Ammar Campa-Najjar, Donna E. Shalala, Jennifer T. Wexton'
  • 'bibliography.author.name: Austen, Jane, Gilman, Charlotte Perkins, Carroll, Lewis, Shelley, Mary Wollstonecraft, Kafka, Franz, Twain, Mark, Wilde, Oscar, Douglass, Frederick, Ibsen, Henrik, Melville, Herman, Doyle, Arthur Conan, Dickens, Charles, Joyce, James, Swift, Jonathan, Stoker, Bram, Machiavelli, Niccolo, Tolstoy, Leo, graf, Grimm, Wilhelm, Vatsyayana, Unknown'
Very short text
  • 'above_ground_sighter_measurement: None, FALSE, 4, 3, 30, 10, 6, 24, 8, 25, 5, 50, 70, 12, 2, 20, 7, 13, 15, 28'
  • 'status: N, Y, REMOVE, None, 1, ?, H, R, M, T'
  • 'review_reason_code: 2, 1, 4, None, 5, 3, 7, 3?, 8, D, ?, 3, 1, 1 or 2, D or 1, 7B, 1, 2, 1 OR 2, D OR 2, B, 4?'
URI
Latitude
  • 'Latitude: 48,87217700, 48,85543800, 48,87416100, 48,87322500, 48,87422500, 48,84189000, 48,86617200, 48,87112100, 48,86552200, 48,87623100, 48,85609000, 48,85642700, 48,86853300, 48,87465400, 48,86995000, 48,85654000, 48,87022000, 48,86962600, 48,85663200, 48,83476200'
  • 'Latitude: 50.17, 45.775, 42.17, 38.87, 43.25, 42.6, 41.73, 40.827, 40.821, 40.73, 39.48, 38.789, 38.638, 38.49, 38.404, 37.748, 37.1, 36.77, 39.284, 37.615'
  • 'lat: 40.7940823884086, 40.7948509408039, 40.7667178072558, 40.7697032606755, 40.797533370163, 40.7902561000937, 40.7693045133578, 40.7942883045566, 40.7729752391435, 40.7903128889029, 40.7762126854894, 40.7725908847499, 40.7931811701082, 40.7917367820255, 40.7829723919744, 40.7742879599026, 40.7823507678183, 40.7919669739962, 40.7702795904962, 40.7698124821507'
Time
  • 'STOP_FRISK_TIME: 14:26:00, 11:10:00, 11:35:00, 13:20:00, 21:25:00, 20:00:00, 19:58:00, 13:15:00, 8:16:00, 18:44:00, 22:30:00, 4:45:00, 18:30:00, 0:00:00, 9:58:00, 11:15:00, 13:00:00, 8:00:00, 14:57:00, 4:15:00'
Postal Code
  • 'Code postal: 77700.0, nan'
Country ISO Code
  • 'Runner-up Nationality: AUS, GBR, NZL, FRA, USA, RSA, CZE, ARG, GER, SUI, ESP, CRO, ROM, DEN, TCH, URS, CZ, SRB, CND, SWE'
  • 'Champion Nationality: AUS, FRA, GBR, NZL, USA, SRB, SUI, SWE, CZE, ESP, GER, NED, CRO, BRA, RUS'
First Name
  • 'Top Name: Mary, Linda, Debra, Lisa, Michelle, Jennifer, Jessica, Samantha, Ashley, Hannah, Emily, Madison, Emma, Isabella, Sophia, Olivia, John, Robert, James, David'
City Name
  • 'Incident.Location.City: Shelton, Aloha, Wichita, San Francisco, Evans, Guthrie, Chandler, Assaria, Burlington, Knoxville, Stockton, Freeport, Columbus, Des Moines, New Orleans, Huntley, Salt Lake City, Strong, Syracuse, England'
Color
  • 'color: Yellow, Black, White'
  • 'highlight_fur_color: None, Cinnamon, White, Gray, Cinnamon, White, Gray, White, Black, Cinnamon, White, Black, Black, White, Black, Cinnamon, Gray, Black'
  • 'primary_fur_color: None, Gray, Cinnamon, Black'
License Plate
  • 'plate: AZIZ714, BATBOX1, BBOMBS, BEACHY1, BLK PWR5, BOT TAK, CHERIPI, CIO FTW, DAVES88, DMOBGFY, DOITFKR, EGGPUTT, F DIABDZ, FJ 666, FKK OFF, FKN BLAK, FLT ATCK, F LUPUS, HVNNHEL, H8DES'
AM/PM
  • 'shift: PM, AM'
Company Name
  • "company.name: Microsoft, Berkshire Hathaway, Telmex, F. Hoffmann-La Roche, Zara, Henderson Land Development, Oracle, Lin Yuan Group, Aldi, Sun Hung Kai Properties, Kingdom Holding Company, Koch industries, Cheung king, Walmart, Seibu Corporation, Las Vegas Sands, Aldi Nord, Tetra Pak, BMW, L'Oreal"
Secondary Address
  • 'STOP_LOCATION_APARTMENT: (null), 2, 7, 4TH, 2FL, ROOF, ROOF T, BASEME, LOBBY, 17TH, 2 FLOO, 12, 1701, HALLWA, 1E, 5D, SIDEWA, FRONT, 12C, None'

Evaluation

Metrics

Label Accuracy
all 0.7512
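
The card does not say which evaluation split produced this figure. As a reminder of what the metric measures, here is a small sketch of accuracy as the fraction of predictions that exactly match the expected label. The function name and the four-item toy data are illustrative assumptions, not the actual evaluation set.

```python
def accuracy(predicted, expected):
    """Fraction of predictions that exactly match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Toy data for illustration only: 3 of 4 predictions are correct.
predicted = ["Date", "Year", "Integer", "Color"]
expected = ["Date", "Year", "Floating Point Number", "Color"]

score = accuracy(predicted, expected)
print(score)  # prints: 0.75
```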

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("quantisan/paraphrase-MiniLM-L3-v2-93dataset")
# Run inference
preds = model("variety: Western, Eastern")
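
The inference input above follows the same pattern as the label examples: a column name, a colon, then a comma-separated sample of values. The sketch below builds such a string from a column name and its values. The helper name `format_column` and the cap of 20 values are assumptions inferred from the examples in this card (none lists more than 20 values); the card does not document the exact preprocessing used for training.

```python
def format_column(name, values, max_values=20):
    """Build 'name: v1, v2, ...' from the first distinct values of a column.

    Hypothetical helper; the max_values default is a guess based on the
    label examples, not a documented setting.
    """
    # dict.fromkeys removes duplicates while keeping first-seen order.
    distinct = list(dict.fromkeys(str(v) for v in values))
    return f"{name}: {', '.join(distinct[:max_values])}"


text = format_column("variety", ["Western", "Eastern", "Western"])
print(text)  # prints: variety: Western, Eastern
```

The resulting string can then be passed to the loaded model in the same way as the literal string in the example above.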

Training Details

Training Set Metrics

Training set Min Median Max
Word count 2 22.3314 85
Label Training Sample Count
Categorical 8
Timestamp 5
Date 8
Integer 8
Partial timestamp 4
Short text 8
Very short text 3
AM/PM 1
Boolean 8
City Name 1
Color 3
Company Name 1
Country ISO Code 2
Country Name 8
Currency Code 1
Day of Month 4
Day of Week 4
First Name 1
Floating Point Number 8
Full Name 8
Last Name 2
Latitude 4
License Plate 1
Longitude 4
Month Name 6
Month Number 4
Occupation 3
Postal Code 1
Secondary Address 1
Slug 8
Street Address 3
Street Name 3
Time 3
U.S. State 8
U.S. State Abbreviation 6
URI 1
URL 8
Year 8
Zip Code 4

Training Hyperparameters

  • batch_size: (8, 8)
  • num_epochs: (4, 4)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
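
The batch size and the oversampling strategy can be related to the length of the training run with a short calculation. The sketch below assumes that contrastive pairs are drawn from all unordered combinations of the training samples, and that oversampling repeats the smaller group (here the same-label pairs) until it matches the larger one. Both points are assumptions about the sampler, not statements from this card. The per-label counts are copied from the Training Set Metrics table.

```python
from math import comb

# Training sample counts per label, from the Training Set Metrics table.
counts = {
    "Categorical": 8, "Timestamp": 5, "Date": 8, "Integer": 8,
    "Partial timestamp": 4, "Short text": 8, "Very short text": 3,
    "AM/PM": 1, "Boolean": 8, "City Name": 1, "Color": 3,
    "Company Name": 1, "Country ISO Code": 2, "Country Name": 8,
    "Currency Code": 1, "Day of Month": 4, "Day of Week": 4,
    "First Name": 1, "Floating Point Number": 8, "Full Name": 8,
    "Last Name": 2, "Latitude": 4, "License Plate": 1, "Longitude": 4,
    "Month Name": 6, "Month Number": 4, "Occupation": 3, "Postal Code": 1,
    "Secondary Address": 1, "Slug": 8, "Street Address": 3,
    "Street Name": 3, "Time": 3, "U.S. State": 8,
    "U.S. State Abbreviation": 6, "URI": 1, "URL": 8, "Year": 8,
    "Zip Code": 4,
}

total_samples = sum(counts.values())                        # 172
all_pairs = comb(total_samples, 2)                          # 14706
positive_pairs = sum(comb(k, 2) for k in counts.values())   # same-label pairs
negative_pairs = all_pairs - positive_pairs                 # different-label pairs

# Assumed oversampling: both groups are brought up to the size of the larger.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)

batch_size = 8  # first element of batch_size: (8, 8)
steps_per_epoch = pairs_per_epoch // batch_size
print(positive_pairs, negative_pairs, steps_per_epoch)  # prints: 438 14268 3567
```

Under these assumptions the result agrees with the 3567 steps logged at epoch 1.0 in the Training Results table.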

Training Results

Epoch Step Training Loss Validation Loss
0.0003 1 0.3882 -
0.0140 50 0.1864 -
0.0280 100 0.1588 -
0.0421 150 0.15 -
0.0561 200 0.1537 -
0.0701 250 0.1325 -
0.0841 300 0.132 -
0.0981 350 0.1149 -
0.1121 400 0.1198 -
0.1262 450 0.1035 -
0.1402 500 0.0907 -
0.1542 550 0.0917 -
0.1682 600 0.0875 -
0.1822 650 0.0803 -
0.1962 700 0.0669 -
0.2103 750 0.0671 -
0.2243 800 0.0614 -
0.2383 850 0.0642 -
0.2523 900 0.0481 -
0.2663 950 0.0548 -
0.2803 1000 0.0346 -
0.2944 1050 0.0406 -
0.3084 1100 0.0403 -
0.3224 1150 0.0349 -
0.3364 1200 0.0312 -
0.3504 1250 0.0378 -
0.3645 1300 0.0335 -
0.3785 1350 0.0323 -
0.3925 1400 0.0234 -
0.4065 1450 0.0313 -
0.4205 1500 0.022 -
0.4345 1550 0.0326 -
0.4486 1600 0.0233 -
0.4626 1650 0.0195 -
0.4766 1700 0.0254 -
0.4906 1750 0.0211 -
0.5046 1800 0.0198 -
0.5186 1850 0.0201 -
0.5327 1900 0.0216 -
0.5467 1950 0.0174 -
0.5607 2000 0.0176 -
0.5747 2050 0.0234 -
0.5887 2100 0.0172 -
0.6027 2150 0.0129 -
0.6168 2200 0.0151 -
0.6308 2250 0.015 -
0.6448 2300 0.0164 -
0.6588 2350 0.0137 -
0.6728 2400 0.014 -
0.6869 2450 0.0154 -
0.7009 2500 0.0135 -
0.7149 2550 0.0164 -
0.7289 2600 0.0139 -
0.7429 2650 0.0164 -
0.7569 2700 0.0106 -
0.7710 2750 0.0084 -
0.7850 2800 0.0133 -
0.7990 2850 0.0114 -
0.8130 2900 0.0066 -
0.8270 2950 0.0091 -
0.8410 3000 0.0126 -
0.8551 3050 0.0107 -
0.8691 3100 0.0068 -
0.8831 3150 0.006 -
0.8971 3200 0.007 -
0.9111 3250 0.0155 -
0.9251 3300 0.0111 -
0.9392 3350 0.0049 -
0.9532 3400 0.0076 -
0.9672 3450 0.0092 -
0.9812 3500 0.0086 -
0.9952 3550 0.0061 -
1.0 3567 - 0.1341
1.0093 3600 0.0073 -
1.0233 3650 0.0065 -
1.0373 3700 0.0063 -
1.0513 3750 0.0094 -
1.0653 3800 0.0114 -
1.0793 3850 0.0084 -
1.0934 3900 0.0098 -
1.1074 3950 0.0058 -
1.1214 4000 0.0045 -
1.1354 4050 0.018 -
1.1494 4100 0.0077 -
1.1634 4150 0.0067 -
1.1775 4200 0.0061 -
1.1915 4250 0.0037 -
1.2055 4300 0.0045 -
1.2195 4350 0.0033 -
1.2335 4400 0.0067 -
1.2475 4450 0.0054 -
1.2616 4500 0.0057 -
1.2756 4550 0.004 -
1.2896 4600 0.0033 -
1.3036 4650 0.0076 -
1.3176 4700 0.0045 -
1.3317 4750 0.0068 -
1.3457 4800 0.0043 -
1.3597 4850 0.0049 -
1.3737 4900 0.0045 -
1.3877 4950 0.0055 -
1.4017 5000 0.0065 -
1.4158 5050 0.0029 -
1.4298 5100 0.0041 -
1.4438 5150 0.0064 -
1.4578 5200 0.0031 -
1.4718 5250 0.0078 -
1.4858 5300 0.0031 -
1.4999 5350 0.004 -
1.5139 5400 0.0035 -
1.5279 5450 0.0062 -
1.5419 5500 0.0062 -
1.5559 5550 0.0065 -
1.5699 5600 0.0036 -
1.5840 5650 0.0037 -
1.5980 5700 0.0047 -
1.6120 5750 0.0037 -
1.6260 5800 0.0028 -
1.6400 5850 0.0052 -
1.6541 5900 0.0043 -
1.6681 5950 0.0029 -
1.6821 6000 0.0064 -
1.6961 6050 0.0031 -
1.7101 6100 0.0023 -
1.7241 6150 0.002 -
1.7382 6200 0.0041 -
1.7522 6250 0.0033 -
1.7662 6300 0.0043 -
1.7802 6350 0.0023 -
1.7942 6400 0.0036 -
1.8082 6450 0.0024 -
1.8223 6500 0.0016 -
1.8363 6550 0.003 -
1.8503 6600 0.0043 -
1.8643 6650 0.0043 -
1.8783 6700 0.0017 -
1.8923 6750 0.0018 -
1.9064 6800 0.0029 -
1.9204 6850 0.0026 -
1.9344 6900 0.0039 -
1.9484 6950 0.0019 -
1.9624 7000 0.0041 -
1.9765 7050 0.0019 -
1.9905 7100 0.0023 -
2.0 7134 - 0.1286
2.0045 7150 0.0016 -
2.0185 7200 0.0017 -
2.0325 7250 0.0016 -
2.0465 7300 0.0019 -
2.0606 7350 0.0015 -
2.0746 7400 0.0016 -
2.0886 7450 0.0015 -
2.1026 7500 0.0015 -
2.1166 7550 0.0034 -
2.1306 7600 0.0043 -
2.1447 7650 0.0016 -
2.1587 7700 0.0016 -
2.1727 7750 0.0015 -
2.1867 7800 0.0015 -
2.2007 7850 0.0017 -
2.2147 7900 0.0013 -
2.2288 7950 0.0016 -
2.2428 8000 0.0013 -
2.2568 8050 0.0039 -
2.2708 8100 0.0053 -
2.2848 8150 0.0025 -
2.2989 8200 0.0015 -
2.3129 8250 0.0012 -
2.3269 8300 0.006 -
2.3409 8350 0.0014 -
2.3549 8400 0.0014 -
2.3689 8450 0.0028 -
2.3830 8500 0.0015 -
2.3970 8550 0.0019 -
2.4110 8600 0.0017 -
2.4250 8650 0.002 -
2.4390 8700 0.0016 -
2.4530 8750 0.0014 -
2.4671 8800 0.0021 -
2.4811 8850 0.0012 -
2.4951 8900 0.0015 -
2.5091 8950 0.0012 -
2.5231 9000 0.0012 -
2.5371 9050 0.0016 -
2.5512 9100 0.0016 -
2.5652 9150 0.0013 -
2.5792 9200 0.0028 -
2.5932 9250 0.0013 -
2.6072 9300 0.0011 -
2.6213 9350 0.0035 -
2.6353 9400 0.0013 -
2.6493 9450 0.0012 -
2.6633 9500 0.0037 -
2.6773 9550 0.0012 -
2.6913 9600 0.0011 -
2.7054 9650 0.0037 -
2.7194 9700 0.0012 -
2.7334 9750 0.0013 -
2.7474 9800 0.0013 -
2.7614 9850 0.001 -
2.7754 9900 0.0011 -
2.7895 9950 0.0012 -
2.8035 10000 0.0012 -
2.8175 10050 0.001 -
2.8315 10100 0.001 -
2.8455 10150 0.0011 -
2.8595 10200 0.0009 -
2.8736 10250 0.0018 -
2.8876 10300 0.0013 -
2.9016 10350 0.0009 -
2.9156 10400 0.0033 -
2.9296 10450 0.0034 -
2.9437 10500 0.0011 -
2.9577 10550 0.0013 -
2.9717 10600 0.0009 -
2.9857 10650 0.0009 -
2.9997 10700 0.0011 -
3.0 10701 - 0.1205
3.0137 10750 0.0009 -
3.0278 10800 0.0009 -
3.0418 10850 0.0032 -
3.0558 10900 0.0008 -
3.0698 10950 0.0013 -
3.0838 11000 0.0033 -
3.0978 11050 0.0011 -
3.1119 11100 0.0008 -
3.1259 11150 0.0009 -
3.1399 11200 0.0008 -
3.1539 11250 0.0033 -
3.1679 11300 0.0032 -
3.1819 11350 0.0008 -
3.1960 11400 0.0008 -
3.2100 11450 0.001 -
3.2240 11500 0.0009 -
3.2380 11550 0.0008 -
3.2520 11600 0.0008 -
3.2660 11650 0.0008 -
3.2801 11700 0.0009 -
3.2941 11750 0.0008 -
3.3081 11800 0.0007 -
3.3221 11850 0.0008 -
3.3361 11900 0.0008 -
3.3502 11950 0.0009 -
3.3642 12000 0.0008 -
3.3782 12050 0.0007 -
3.3922 12100 0.0009 -
3.4062 12150 0.0008 -
3.4202 12200 0.0008 -
3.4343 12250 0.0009 -
3.4483 12300 0.0008 -
3.4623 12350 0.0008 -
3.4763 12400 0.0008 -
3.4903 12450 0.0009 -
3.5043 12500 0.0007 -
3.5184 12550 0.0008 -
3.5324 12600 0.0009 -
3.5464 12650 0.0031 -
3.5604 12700 0.0009 -
3.5744 12750 0.0008 -
3.5884 12800 0.0007 -
3.6025 12850 0.0007 -
3.6165 12900 0.0007 -
3.6305 12950 0.0008 -
3.6445 13000 0.0007 -
3.6585 13050 0.0008 -
3.6726 13100 0.0007 -
3.6866 13150 0.0007 -
3.7006 13200 0.0008 -
3.7146 13250 0.0007 -
3.7286 13300 0.0031 -
3.7426 13350 0.0006 -
3.7567 13400 0.0008 -
3.7707 13450 0.0007 -
3.7847 13500 0.0006 -
3.7987 13550 0.0007 -
3.8127 13600 0.0008 -
3.8267 13650 0.0007 -
3.8408 13700 0.0008 -
3.8548 13750 0.0007 -
3.8688 13800 0.0007 -
3.8828 13850 0.0007 -
3.8968 13900 0.0007 -
3.9108 13950 0.0007 -
3.9249 14000 0.0031 -
3.9389 14050 0.003 -
3.9529 14100 0.0007 -
3.9669 14150 0.0007 -
3.9809 14200 0.0007 -
3.9950 14250 0.0007 -
4.0 14268 - 0.1155
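
The log above can be read with a little arithmetic. The sketch below takes the four evaluation rows, derives the number of steps per epoch, converts a step number into the fractional epoch shown in the first column, and picks the checkpoint with the lowest validation loss. Selecting by validation loss is an assumption: the card sets load_best_model_at_end to True but does not state which metric decides the best checkpoint.

```python
# (epoch, step, validation_loss) rows copied from the table above.
eval_rows = [
    (1.0, 3567, 0.1341),
    (2.0, 7134, 0.1286),
    (3.0, 10701, 0.1205),
    (4.0, 14268, 0.1155),
]

# The first evaluation happens at the end of epoch 1.
steps_per_epoch = eval_rows[0][1]


def epoch_at(step):
    """Fractional epoch for a step, rounded as in the log's first column."""
    return round(step / steps_per_epoch, 4)


# Assumed selection rule: lowest validation loss wins.
best_epoch, best_step, best_loss = min(eval_rows, key=lambda row: row[2])

print(epoch_at(3600))                    # prints: 1.0093
print(best_epoch, best_step, best_loss)  # prints: 4.0 14268 0.1155
```

Validation loss falls at every evaluation (0.1341, 0.1286, 0.1205, 0.1155), so under this rule the final checkpoint would be the one kept.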

Framework Versions

  • Python: 3.11.10
  • SetFit: 1.1.0
  • Sentence Transformers: 3.2.0
  • Transformers: 4.45.2
  • PyTorch: 2.4.1+cu124
  • Datasets: 3.0.1
  • Tokenizers: 0.20.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}