11/18/2022

Openmoko wiki reader

The UnpredicTable data subsets are based on clustering (for the clustering details, please see our publication). Since the tables come from the web, the distribution of tasks and topics is very broad. The shape of our dataset is very wide, i.e., we have thousands of tasks, each with only a few examples, whereas most current NLP datasets are very deep, i.e., tens of tasks with many examples. This implies that our dataset covers a broad range of potential tasks, e.g., multiple-choice, question-answering, table-question-answering, text-classification, etc. The intended use of this dataset is to improve few-shot performance by fine-tuning/pre-training on our dataset.

Each task is represented as a jsonline file and consists of several few-shot examples. Each example is a dictionary containing a field 'task', which identifies the task, followed by an 'input', 'options', and 'output' field. The 'input' field contains several column elements of a specific row in the table, while the 'output' field is a target which represents an individual column of the same row. Each task contains several such examples, which can be concatenated as a few-shot task. In the case of multiple choice classification, the 'options' field contains the possible classes that a model needs to choose from. There are also additional meta-data fields such as 'pageTitle', 'title', 'outputColName', 'url', 'wdcFile'.
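To make the example format described above more concrete, here is a minimal Python sketch of how the examples of one task might be read from jsonline text and concatenated into a few-shot prompt. The sample records, the prompt layout, and the helper names (`parse_jsonl`, `format_example`, `build_few_shot_prompt`) are assumptions made for illustration; only the field names 'task', 'input', 'options', and 'output' come from the description. In particular, it assumes that 'input' and 'output' are plain strings, which may not hold for every task.

```python
import json

# Hypothetical records, invented for illustration only.
# Real task files may encode 'input' differently.
SAMPLE_JSONL = """\
{"task": "demo_task", "input": "[name] Alpha [type] fruit", "options": ["red", "green"], "output": "red"}
{"task": "demo_task", "input": "[name] Beta [type] leaf", "options": ["red", "green"], "output": "green"}
{"task": "demo_task", "input": "[name] Gamma [type] fruit", "options": ["red", "green"], "output": "red"}
"""


def parse_jsonl(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def format_example(example, with_output=True):
    """Render one example as text; 'options' is shown only when non-empty."""
    parts = ["Input: " + example["input"]]
    options = example.get("options") or []
    if options:
        parts.append("Options: " + ", ".join(options))
    if with_output:
        parts.append("Output: " + example["output"])
    else:
        # Leave the answer slot empty for the model to complete.
        parts.append("Output:")
    return "\n".join(parts)


def build_few_shot_prompt(examples):
    """Use all but the last example as demonstrations and the last as the query.

    Returns the prompt text and the held-out target for the query.
    """
    *shots, query = examples
    blocks = [format_example(e) for e in shots]
    blocks.append(format_example(query, with_output=False))
    return "\n\n".join(blocks), query["output"]


examples = parse_jsonl(SAMPLE_JSONL)
prompt, target = build_few_shot_prompt(examples)
print(prompt)
```

Holding out the last example as the query is just one possible choice; for fine-tuning or pre-training, the same helpers could be used to sample different demonstration sets from each task.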