News Article / Wikipedia Page Pairing
Free
Khan
30
2021-08-24
NLP
Resource description
Read a short article and choose which of two Wikipedia articles it most closely matches.
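The matching task described above can be sketched as a simple similarity baseline. This is only an illustration under stated assumptions: the page does not describe the dataset's file format or any reference method, so the texts below are made up, and the helper names (tokenize, cosine_similarity, pick_closest) are hypothetical. Bag-of-words cosine similarity is used here as a stand-in technique, not as the dataset authors' approach.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def pick_closest(article: str, candidates: list[str]) -> int:
    """Return the index of the candidate page most similar to the article."""
    article_vec = Counter(tokenize(article))
    scores = [cosine_similarity(article_vec, Counter(tokenize(c))) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


# Toy, made-up texts (assumption: not taken from the dataset itself).
article = "The central bank raised interest rates to slow inflation."
wiki_a = "Inflation is a general rise in prices; central banks adjust interest rates in response."
wiki_b = "A volcano is a rupture in the crust of a planet that lets lava and ash escape."
choice = pick_closest(article, [wiki_a, wiki_b])
print("Closest candidate index:", choice)
```

A baseline like this ignores word order and synonyms, so it mainly serves as a sanity check before trying stronger text-classification or embedding-based models.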
Tags
Small-scale
Text classification