Khan - Page 6 - Dataset Marketplace

NLP

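Humor Detection in Product Question Answering Systems

This dataset provides labeled data for humor detection in product question answering systems. The dataset contains 3 CSV files: Humorous.csv, which contains the humorous product questions; Non-humorous-unbiased.csv ...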

Khan

2021-08-24

NLP

Helpful Sentences from Reviews

A set of sentences extracted from customer reviews, labeled with their helpfulness scores.

Khan

2021-08-24

NLP

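Enriched Topical-Chat Dataset for Knowledge-Grounded Dialogue Systems

This dataset provides additional annotations on top of the publicly released Topical-Chat dataset (https://github.com/alexa/Topical-Chat), which will help in reproducing our ...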

Khan

2021-08-24

NLP

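Discrete Reasoning Over the Content of Paragraphs (DROP)

The DROP dataset contains 96k question-answer (QA) pairs over 6.7K paragraphs, split between train (77k QAs), development (9.5k QAs), and a hidden test partition (9.5k Q ...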

Khan

2021-08-24

NLP

DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue

This bucket contains the checkpoints used to reproduce the baseline results reported in the DialoGLUE benchmark host ...

Khan

2021-08-24

NLP

Automatic Speech Recognition (ASR) Error Robustness

Sentence classification datasets with ASR errors.

Khan

2021-08-24

NLP

Answer Reformulation

Original Stack Exchange answers and their voice-friendly reformulations.

Khan

2021-08-24

NLP

Amazon-PQA

Amazon product questions and their answers, along with the public product information.

Khan

2021-08-24

NLP

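REDASA COVID-19 Open Data

The REaltime DAta Synthesis and Analysis (REDASA) COVID-19 snapshot contains the output of the curation protocol produced by our curator community. A detailed description can be found in our paper ...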

Khan

2021-08-24

NLP

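MIMIC-III ("Medical Information Mart for Intensive Care")

MIMIC-III ("Medical Information Mart for Intensive Care") is a large, single-center database containing information about patients admitted to the critical care units of a large tertiary care hospital. Data ...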

Khan

2021-08-24

NLP

Japanese Tokenizer Dictionaries

Japanese tokenizer dictionaries for use with MeCab.

Khan

2021-08-24

NLP

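Sudachi Language Resources

Japanese dictionaries and word embeddings for natural language processing. SudachiDict is the dictionary for Sudachi, a Japanese tokenizer (morphological analyzer). chiVe is a Japanese pretrained word embedding (word ...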

Khan

2021-08-24

NLP

Common Crawl

A corpus of web crawl data composed of over 50 billion web pages.

Khan

2021-08-24

NLP

Polish Summaries Corpus (PSC)

The dataset contains news articles and their summaries (lang: Polish, iterations: 723, file_type: TSV, tasks: Summarization).

Khan

2021-08-24

NLP

DNA Methylation Corpus

Dataset contains 200 abstracts including a representative sample of all PubMed citations relevant to DNA methylation ...

Khan

2021-08-24

NLP

A Sentiment Analysis Dataset for Code-Mixed Malayalam-English

There is an increasing demand for sentiment analysis of text from social media which are mostly code-mixed. Systems ...

Khan

2021-08-24

NLP

The SOFC-Exp Corpus and Neural Approaches to Information Extraction in the Materials Science Domain

This paper presents a new challenging information extraction task in the domain of materials science. We develop an ...

Khan

2021-08-24

NLP

GGPONC: A Corpus of German Medical Text with Rich Metadata Based on Clinical Practice Guidelines

The lack of publicly accessible text corpora is a major obstacle for progress in natural language processing. For me ...

Khan

2021-08-24

NLP

Sinhala Language Corpora and Stopwords from a Decade of Sri Lankan Facebook

This paper presents two colloquial Sinhala language corpora from the language efforts of the Data, Analysis and Poli ...

Khan

2021-08-24

NLP

Scruples: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes

As AI systems become an increasing part of people's everyday lives, it becomes ever more important that they underst ...

Khan

2021-08-24
