• tanya

NLP Assignment I Sample Assignment

PART-1


The training data is a list of dictionaries (JSONL format). Each element in the list is a distinct article with an associated summary. The format of each element is shown below:


{'url': 'http://fortune.com/2014/10/14/luxury-spending-bain/',
'archive': 'http://web.archive.org/web/20141015014008id_/http://fortune.com/2014/10/14/luxury-spending-bain/',
'title': 'U.S., with help from Chinese tourists, boosts global luxury
spending',
'date': '20141015014008',
'text': 'Despite a crackdown on graft in China, Russia’s conflict with
the West over Ukraine and Europe’s sick-man economy, global luxury
spending is still on track to rise 5% in 2014, according to a report
released on Tuesday by consulting firm Bain & Co.\n\nBain, whose
projections are closely watched by the luxury industry, also stuck to its
forecast that global luxury goods sales will rise between
4% and 6% a year between 2014 and 2017.\n\nSo with all the insanity going
on in the world right now, what is giving luxury a much needed
boost? The good ol’ U.S. of A., which has become a bigger destination for
rich Chinese travelers (now that getting a visa is easier), and
where millennials are developing a taste for luxury as they progress in
their careers and start to make more money, said Claudia D’Arpizio,
a Bain partner in Milan and leader of the firm’s Global Luxury Goods and
Fashion Practice.\n\n“The U.S. is becoming a more and more important
tourist destination for Chinese travelers,” D’Arpizio told Fortune.
Chinese nationals spend three times more on luxury abroad than they do at
home.\n\nIndeed, Nordstrom JWN , Neiman Marcus and Saks Fifth Avenue are
all opening new Manhattan stores in the coming years to tap luxury’s
ongoing explosion in the U.S., fueled by tourism.\n\nU.S. consumer
confidence hit 14 month highs in September, and high end shopping has been
on a tear this year. The stock market has fallen sharply in the last week,
casting a shadow on consumer spending, but the downturn would need
to last for a while to hit luxury, D’Arpizio said. “We see consumer
confidence that is the most important driver.”\n\nThe uptick in the U.S.
comes at an opportune time for global luxury companies. In July, French
luxury group LVMH reported disappointing first-half results for 2012
while rival Kering, whose brands include Gucci, spoke of an “unsettled
business environment.” Meanwhile, L’Oreal said the U.S. rebound would
help its second-half results.\n\nBain, which conducted the study with
Italian luxury industry group Altagamma, said the global luxury market
is on target to reach 223 billion € ($283 bilion) in 2014, which
represents a 5% bump this year, a slower rate than the 7% last
year.\n\nAnd there undoubtedly clouds in the luxury sky: Chinese sales
have been hurt by a crackdown by the government against corruption and
conspicuous consumption. And Russian luxury sales are down hurt by a drop
in tourism to Eastern European countries.',
'summary': 'What China, Russian slowdown? Luxury sales keep face with
forecast, Bain says.',
'compression': 30.5625,
'coverage': 0.75,
'density': 1.0,
'compression_bin': 'medium',
'coverage_bin': 'low',
'density_bin': 'abstractive'}


For the purposes of this exercise, you can ignore all keys other than the title, text and summary.
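As a starting point, the loading step might look like the minimal sketch below. It assumes the file stores one JSON object per line, as the JSONL format implies. The names `parse_records`, `load_jsonl` and `build_source` are illustrative and not part of the assignment, and joining title and text with a newline is only one possible way to form the model input.

```python
import json
from typing import Dict, Iterable, Iterator, List

# Only these keys are needed for the summarization task.
KEEP_KEYS = ("title", "text", "summary")


def parse_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one trimmed record per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Drop url, archive, date and the compression/coverage/density fields.
        yield {key: record.get(key, "") for key in KEEP_KEYS}


def load_jsonl(path: str) -> List[Dict[str, str]]:
    """Read a JSONL file from disk (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return list(parse_records(handle))


def build_source(record: Dict[str, str]) -> str:
    """Combine title and text into one input string (assumed layout)."""
    return f"{record['title']}\n{record['text']}"
```

Each record returned this way carries only the title, text and summary, so the remaining fields never reach the model.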

The task is to build multiple models, using any deep learning tools, that output a summary for any text article (title and text).

You must evaluate the models and choose your "winning" model.
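The assignment does not name an evaluation metric, so the sketch below assumes a simple unigram-overlap F1 score (in the spirit of ROUGE-1) and picks the model with the highest average score on held-out records. The two "models" here are trivial baselines (the title, and the first sentence of the text) standing in for real deep learning models. All function names are hypothetical.

```python
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def tokenize(text: str) -> List[str]:
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"\w+", text.lower())


def unigram_f1(candidate: str, reference: str) -> float:
    """F1 over overlapping unigram counts (an assumed, ROUGE-1-like metric)."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def title_baseline(record: Record) -> str:
    """Baseline: use the title as the summary."""
    return record["title"]


def lead_baseline(record: Record, n: int = 1) -> str:
    """Baseline: use the first n sentences of the text as the summary."""
    sentences = re.split(r"(?<=[.!?])\s+", record["text"].strip())
    return " ".join(sentences[:n])


def pick_winner(
    models: Dict[str, Callable[[Record], str]], records: List[Record]
) -> Tuple[str, Dict[str, float]]:
    """Return the best model name and the mean score of every model."""
    if not records:
        raise ValueError("need at least one evaluation record")
    scores = {}
    for name, summarize in models.items():
        per_record = [unigram_f1(summarize(r), r["summary"]) for r in records]
        scores[name] = sum(per_record) / len(per_record)
    return max(scores, key=scores.get), scores
```

A trained model can be plugged in by adding any function that maps a record to a summary string to the `models` dictionary. Scoring on records the models were not trained on keeps the comparison fair.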


PART-2


The NCBI disease corpus is fully annotated at the mention and concept level to serve as a research resource for the biomedical natural language processing community.

Corpus Characteristics

● 793 PubMed abstracts

● 6,892 disease mentions

● 790 unique disease concepts

○ Medical Subject Headings (MeSH®)

○ Online Mendelian Inheritance in Man (OMIM®)

● 91% of the mentions map to a single disease concept

● Divided into training, development, and testing sets


Corpus Annotation

● Fourteen annotators

● Two annotators per document (randomly paired)


● Three annotation phases

● Checked for corpus-wide consistency of annotations


The task is to build a model that predicts the named entities in unlabelled text.
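A common way to frame this task is token-level sequence labelling with BIO tags. The sketch below assumes the mention annotations are available as character offsets with an exclusive end, and shows how they could be turned into per-token tags and back into entity strings. The tokenizer, the single `Disease` label and the example sentence are illustrative assumptions, not taken from the corpus.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end, label); end is exclusive


def tokenize_with_offsets(text: str) -> List[Tuple[str, int, int]]:
    """Split into word and punctuation tokens, keeping character offsets."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+|[^\w\s]", text)]


def spans_to_bio(text: str, spans: List[Span]) -> Tuple[List[str], List[str]]:
    """Convert character-level mention spans into token-level BIO tags."""
    tokens = tokenize_with_offsets(text)
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        inside = False
        for i, (_, tok_start, tok_end) in enumerate(tokens):
            # A token belongs to the mention if it lies fully inside the span.
            if tok_start >= start and tok_end <= end:
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return [tok for tok, _, _ in tokens], tags


def bio_to_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity text, label) pairs from predicted or gold BIO tags."""
    entities, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and label != tag[2:]
        )
        if starts_new:
            if current:
                entities.append((" ".join(current), label))
            current, label = [token], tag[2:]
        elif tag.startswith("I-"):
            current.append(token)
        else:  # an "O" tag closes any open entity
            if current:
                entities.append((" ".join(current), label))
            current, label = [], None
    if current:
        entities.append((" ".join(current), label))
    return entities


# Made-up example sentence, for illustration only.
text = "Mutations cause cystic fibrosis in patients."
start = text.index("cystic fibrosis")
tokens, tags = spans_to_bio(text, [(start, start + len("cystic fibrosis"), "Disease")])
```

The resulting `tokens` and `tags` lists can feed any sequence-labelling model, and `bio_to_entities` converts the model's predicted tags back into the entity mentions the task asks for.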



