
Search Results


  • Data Visualisation Tools

What is Data Visualisation?

Data visualisation is the graphical representation of data, used to draw insights from it. It allows us to comprehend complex relationships within the data and is also called information graphics. The main goal of data visualisation is to make it easier to identify patterns, trends and outliers in large data sets. It is one of the crucial elements of the Data Science process: after the data has been collected, processed and modelled, it must be visualised in order to draw conclusions. Data visualisation holds value in other fields as well: a teacher can use it to get a visual measure of students' performance, businesses can use it to spot trends and make decisions, and governments can use it to keep track of changes in the demographics of a place. Today there is almost no limit to the uses of data visualisation.

Why data visualisation?

We live in a digital era where almost everything has gone digital, whether it is shopping, banking, entertainment or education. With such a large population of consumers, data is generated every day on an enormous scale (of the order of trillions of records). It is simply not possible to look at such data in tabular form and grasp its meaning manually. Many data visualisation tools and techniques have therefore been created that can tell a story based on data in a way that is easier to understand. It is well known that we understand concepts better when we have a clear visual of them.

Benefits:
One can tell at a glance whether there is progress or not from the direction of the trend lines.
The information can be easily understood by anyone.
Decisions can be made faster and with fewer mistakes.

There are several ways to visualise data, and different situations call for different types of visualisation. Here are some of the major types of data visualisation:

Line graphs: The most basic type of graph, usually used to display trends or progress over time. A point is plotted for each category and the points are joined by a line. It should be used when the data set is continuous.

Bar graphs: One of the most common types of graph, mainly used for comparisons between groups or to display changes over time. Each category on the x-axis is represented by a bar whose height corresponds to its value on the y-axis. If you have more than 10 categories to compare, a bar graph is a good choice. Note: a histogram is a special kind of bar graph in which each category is a range of numerical values plotted against its frequency.

Pie chart: Used to compare parts of a whole. It is a circle cut into segments according to the share of each part. It is best suited to adding detail to other visualisations.

Scatter plot chart: Also called a scattergram, it is used to determine the relationship between two variables. The x-axis represents one variable and the y-axis the other. If both increase together, there is a positive relationship between them; if one increases while the other decreases, there is a negative relationship. When there is no recognisable pattern, there is no relationship, i.e. the variables are independent of each other. A scatter plot can also be used for cluster analysis, and trend lines can be added to enhance it.
Heatmaps: A heat map displays the relationship between two or more variables and conveys magnitude information, ranging from high to low, through colour or saturation. It comes in handy when one wants to analyse a variable against a matrix of data; the different shades make it easy to spot extremes.

Area chart: Essentially a line chart in which the area below the line is filled with a colour or pattern. It is usually used to display changes in multiple variables over time.

Box plot: Also known as a box-and-whisker plot, it summarises data over an interval. It is a graphical representation of statistical data based on five components: minimum, first quartile, median, third quartile and maximum. The name comes from the fact that the graph looks like a rectangle with lines extending from the top and bottom. Box plots are mainly used in exploratory data analysis.

Geographical maps: Used when one needs to compare a data set across geographic regions.

The list of graphs does not end here; there are many more, such as violin plots, stacked bar graphs and doughnut charts, along with plenty of variations of the graphs mentioned above. We can style our graphs with different colours, line styles and so on to make them more attractive and pleasing to the eye. We can also combine several graphs into a dashboard that tells the story behind the data. Moreover, these graphs can be made dynamic, i.e. they update automatically when the source data is updated.

Now that we are familiar with data visualisation, we will discuss various tools that help us make use of the above representations.

Tableau: Tableau is one of the most powerful visualisation tools available today. It is widely used in Business Intelligence but is also utilised in other sectors such as research, statistics and various industries. It turns raw data into an understandable format, and data can be manipulated inside it to get the desired results. It is very easy to use and does not require any technical background. Visualisations are created in the form of Dashboards and Stories.

Tableau can be broadly classified into two sections:
Developer tools: the tools that let us do the actual work, i.e. create dashboards, reports, charts, stories and other visualisations. The products under this category are Tableau Desktop and Tableau Public.
Sharing tools: as the name suggests, these help in sharing the visualisations created with the developer tools. The products under this category are Tableau Online, Tableau Server and Tableau Reader.

All in all, there are five products in Tableau: Tableau Desktop, Tableau Public, Tableau Online, Tableau Server and Tableau Reader. Let's discuss them one by one.

Tableau Desktop: As mentioned earlier, this is where all the major work is done. It has many features that let you create visualisations very easily. It provides connectivity to data warehouses and to other file types such as Excel, text, JSON and PDF. Visualisations can be stored locally or published. Tableau Desktop can be further classified into two parts:
Tableau Desktop Personal: the workbooks are kept private and access is limited. Workbooks cannot be published online; they can only be distributed offline or in Tableau Public.
Tableau Desktop Professional: Here the only major difference is that workbooks can be published online and there is full access to all features.

Tableau Public: The same as Tableau Desktop, but a free, public version: workbooks cannot be saved locally and can only be uploaded to Tableau's public cloud, where they can be seen and accessed by everyone. There is no privacy in this version, which makes it best suited to individuals who want to learn to work with Tableau.

Tableau Server: Essentially used to share workbooks across an organisation. The work must first be published from Tableau Desktop before it can be uploaded to the server. Once uploaded, anyone with a licence can view the work; the licensed user does not even need Tableau Server installed. Anyone with valid login credentials can view the work in a web browser. The admin of the organisation always has full control over the server.

Tableau Online: Tableau's online sharing tool. Its functionality is similar to Tableau Server, but the data is stored on servers hosted in the cloud and maintained by Tableau. There is no storage limit on the data that can be published. It creates direct links to over 40 cloud-hosted data sources such as MySQL, Hive, Amazon Aurora and Spark SQL. Both Tableau Online and Tableau Server require workbooks created in Tableau Desktop in order to publish, and both also support data streamed from web applications such as Google Analytics and Salesforce.com.

Tableau Reader: A tool used to view workbooks created with the Tableau developer tools. It does not allow editing or modification of the workbook. Anyone who has the workbook can view it using Tableau Reader; in fact, if you want to share a dashboard you have created, the receiver needs Tableau Reader installed.

Tableau can connect to virtually any platform to extract data: simple sources such as Excel and PDF, complex databases such as Oracle, and cloud databases such as Amazon Web Services, Microsoft Azure SQL Database and Google Cloud SQL. On launch, the Tableau interface provides a list of ready data connectors that allow you to connect to a platform and load the data. Once the data is pulled in, it is displayed on the screen. Sheets in Tableau are where the data can be manipulated and visualisations created; the finished visualisations can then be assembled into a dashboard or a story to share. Users who receive the dashboards view the files using Tableau Reader.

Tableau interface on launching: Tableau workbook screen:

The data from Tableau Desktop can be published to Tableau Server, a platform that supports collaboration, distribution, governance, a security model and automation features. With Tableau Server, end users have a better experience accessing files from any location, be it desktop, mobile or email.

Microsoft Excel: We are all familiar with Microsoft Excel and know that it can easily handle large amounts of data in tabular form, with tools that make data manipulation easy. But that is not all it can do: it also provides tools for visualising data, and we can easily plot graphs for data held in tabular form.
The steps are as follows:

Select the data to be represented as a graph and click on the Insert tab in the toolbar. In the Insert tab, under the Charts column, select the type of chart you want to plot; here we will plot a bar chart. There are several variations of bar charts available, and similarly many options exist for the other chart types, so select whichever chart type suits your requirements. After plotting the chart we can modify its appearance: we can add chart titles and axis titles, change the colour scheme, and change the scale of the graph. All of this can be done using the Design and Format tabs under Chart Tools in the toolbar, or through the Chart Elements, Chart Styles and Chart Filters options shown alongside the chart.

Design tab: Format tab:

Plotting graphs in Excel is hassle free. We can also produce dynamic graphs that change values based on applied filters. This is done by making a pivot chart of the data and adding slicers and a timeline to it. Let us see an example:

Select the data to be inserted into a pivot table. In the Insert tab, click on the PivotTable option in the Tables column. In the Create PivotTable dialog box, select the New Worksheet option if you want to create the pivot table in a fresh sheet, or choose the Existing Worksheet option to create it in the same sheet; in that case you will need to provide the range of cells where the pivot table should be placed. Click OK to create the pivot table. Select the elements for the Rows and Values columns in the PivotTable Fields pane to build the pivot table. After creating the pivot table, select the PivotChart option under the Analyze tab, choose the type of chart you want to plot and click OK; the chart will be displayed. Select the chart and, under the Analyze tab, select the Insert Slicer option in the Filter column. In the Insert Slicers dialog box, check the PivotTable field for which you want to create a slicer. The slicer will be created, and you can select one or more elements of the field to update the chart accordingly. In this example we created a slicer for the field Day, and we can use the graph dynamically in that way. We can also add a timeline to the pivot chart if our data contains dates and times; that way we can show the data for a specific date or time with a single click in the same graph, with no need to make separate graphs for each case. Moreover, we can connect the slicers and timelines to other charts as well, provided the other charts share common fields in the same order.

Apart from such applications, there are several libraries, built specifically for data visualisation, available in programming languages such as R and Python. We will now discuss a few of them. Note: Tableau is often the better choice here, because it can extract almost any type of data and can plot many kinds of graphs, whereas Excel is more limited in the variety of charts it offers.

ggplot2, ggplot and plotnine

ggplot2, created by Hadley Wickham in 2005, is a data visualisation library for the programming language R, whereas ggplot and plotnine (created by Hassan Kibirige) are Python implementations of the Grammar of Graphics (a general scheme for data visualisation which breaks up graphs into semantic components such as scales and layers), inspired by the interface of the ggplot2 package from R.
The basic building blocks according to the Grammar of Graphics are:
data: the data plus a set of aesthetic mappings describing how variables are mapped
geom: geometric objects, which represent what you actually see on the plot: points, lines, polygons, etc.
stats: statistical transformations, which summarise the data in many useful ways
scale: scales map values in the data space to values in an aesthetic space
coord: a coordinate system, which describes how data coordinates are mapped to the plane of the graphic
facet: a faceting specification, which describes how to break up the data into subsets so each subset can be plotted individually

If one has experience with ggplot2, then it is easy to shift to plotnine or ggplot. The main goal of these libraries is to produce high-quality visuals with less code. It is worth mentioning that ggplot works well with pandas, so if you are planning on using ggplot it is best to keep your data in DataFrames.

The ggplot2 library can make anything from simple to very complex graphs with univariate or multivariate, numerical or categorical data. ggplot2 provides two functions, qplot() (quick plot) and ggplot(); the ggplot and plotnine libraries also use a ggplot() function. The qplot() function hides much of the complexity when creating standard graphs, while ggplot() allows maximum features and flexibility. To plot a graph with ggplot(), we must provide three things:
Data
Aesthetics: how the columns of the data frame are translated into positions, colours, sizes and shapes of graphical elements
Geometric objects (geom)

Let us look at an example:

from ggplot import *

diamonds.head()
ggplot(diamonds, aes(x='carat', y='price', color='cut')) +\
    geom_point() +\
    scale_color_brewer(type='diverging', palette=4) +\
    xlab("Carats") + ylab("Price") + ggtitle("Diamonds")

We used the diamonds data set provided with ggplot to make the above plot. You can look at the use of ggplot to create various graphs here: https://yhat.github.io/ggpy/

Matplotlib

Matplotlib was introduced by John Hunter in 2002. It is the main visualisation library in Python, and many other visualisation libraries are built on top of it. The library itself is huge, with approximately 70,000 lines of code, and is still under active development. It is typically used together with NumPy, the numerical mathematics extension for Python. It contains an interface, "pyplot", designed to resemble that of MATLAB. We can plot almost anything with matplotlib, but non-basic plots can be complex to implement, so it is often advisable to use higher-level tools when creating complex graphics.
Let us take a look at some examples:

Square function plot

from matplotlib import pyplot as plt

data = [x * x for x in range(20)]
plt.plot(data)
plt.show()

Sine function plot

import matplotlib
import matplotlib.pyplot as plt
import numpy as np

# Data for plotting
t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2 * np.pi * t)

fig, ax = plt.subplots()
ax.plot(t, s)
ax.set(xlabel='time (s)', ylabel='voltage (mV)',
       title='About as simple as it gets, folks')
ax.grid()

fig.savefig("test.png")
plt.show()

Create Path and PathPatch objects through Matplotlib's API

import matplotlib.path as mpath
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

Path = mpath.Path
path_data = [
    (Path.MOVETO, (1.58, -2.57)),
    (Path.CURVE4, (0.35, -1.1)),
    (Path.CURVE4, (-1.75, 2.0)),
    (Path.CURVE4, (0.375, 2.0)),
    (Path.LINETO, (0.85, 1.15)),
    (Path.CURVE4, (2.2, 3.2)),
    (Path.CURVE4, (3, 0.05)),
    (Path.CURVE4, (2.0, -0.5)),
    (Path.CLOSEPOLY, (1.58, -2.57)),
]
codes, verts = zip(*path_data)
path = mpath.Path(verts, codes)
patch = mpatches.PathPatch(path, facecolor='r', alpha=0.5)
ax.add_patch(patch)

# plot control points and connecting lines
x, y = zip(*path.vertices)
line, = ax.plot(x, y, 'go-')

ax.grid()
ax.axis('equal')
plt.show()

You can look at the use of matplotlib to create various graphs here: https://matplotlib.org/tutorials/index.html#introductory

Seaborn

Seaborn is a library for creating statistical graphics in Python. It is built on top of matplotlib and integrates closely with pandas data structures. Because it wraps matplotlib with higher-level defaults, its plots tend to look better out of the box and are easy to customise with colour palettes. The aim of Seaborn is to provide high-level commands for a variety of plot types useful for statistical data exploration, and even some statistical model fitting. It has many complex plots built in.

Let us take a look at some examples:

Plotting a histogram and density function together

import numpy as np
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

data = np.random.multivariate_normal([0, 0], [[5, 2], [2, 2]], size=2000)
data = pd.DataFrame(data, columns=['x', 'y'])
sns.distplot(data['x'])
sns.distplot(data['y'])

Plotting a scatter plot

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

sns.set(style="darkgrid")
tips = sns.load_dataset("tips")
sns.relplot(x="total_bill", y="tip", hue="size", data=tips)

You can look at the use of seaborn to create various graphs here: https://seaborn.pydata.org/tutorial.html

Working with ggplot is often a good option, since a few lines of code can produce complex graphics. Matplotlib and Seaborn are best used together to produce good graphics, as Seaborn is meant as a complement to matplotlib, not a replacement.
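To illustrate that closing point, here is a minimal sketch of the two libraries working together: seaborn supplies the styling and the statistical plot, while the underlying matplotlib Axes object is used for fine-grained touches. The random data and labels are invented for illustration only.

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Seaborn handles the styling and the statistical plot ...
sns.set(style="whitegrid")
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2 * x + rng.normal(size=200)
ax = sns.scatterplot(x=x, y=y)

# ... while plain matplotlib calls refine the same Axes object.
ax.set(xlabel="x", ylabel="y", title="Seaborn plot refined with matplotlib")
ax.axhline(0, color="grey", linewidth=1)
plt.tight_layout()
plt.show()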

  • Solution Reinforcement Learning for Taxi-v2

There are 4 locations (labelled by different letters), and our job is to pick up the passenger at one location and drop them off at another. We receive +20 points for a successful drop-off and lose 1 point for every time-step it takes. There is also a 10-point penalty for illegal pick-up and drop-off actions.

First import all related libraries:

import numpy as np
import gym
import random
import pandas as pd
import spacy
from spacy.tokens import Span
from spacy.matcher import PhraseMatcher

env = gym.make("Taxi-v3")
env.render()

Output:

Fetching the origin, destination and time of pickup from the SMS data:

def fetch_pickup_drop(text_file_path):
    # Append all the texts to a list
    texts_list = []
    # read the SMS text file
    df = pd.read_csv(text_file_path, header=None, names=['Sms'])
    for i in range(0, df.shape[0]):
        s = df.iloc[i, 0]
        texts_list.append(s)

    # add the locations through add_pipe
    l1 = []
    LOCATIONS = ["dwarka sector 23", "dwarka sector 21", "hauz khaas", "airport"]
    nlp = spacy.load('en')
    matcher = PhraseMatcher(nlp.vocab)
    matcher.add("LOCATIONS", None, *list(nlp.pipe(LOCATIONS)))

    def places_component(doc):
        doc.ents = [Span(doc, start, end, label="GPE")
                    for match_id, start, end in matcher(doc)]
        return doc

    nlp.add_pipe(places_component)  # last=True

    # fetch the locations from each text and append them to the l1 list
    for doc in nlp.pipe(texts_list):
        l1.append([(ent.text, ent.label_) for ent in doc.ents])

    dest = []
    pickup = []
    timing = []
    for i in range(0, len(texts_list)):
        str_text = texts_list[i].lower()
        str1 = 'for ' + l1[i][1][0]
        str2 = 'to ' + l1[i][1][0]
        str3 = 'from ' + l1[i][1][0]
        # fetch the pickup and drop locations from each SMS and append them
        # to the pickup and destination lists
        if str1 in str_text or str2 in str_text:
            dest.append(l1[i][1][0])
            pickup.append(l1[i][0][0])
        elif str3 in str_text:
            dest.append(l1[i][0][0])
            pickup.append(l1[i][1][0])
        # fetch the timing from each text and append it to the timing list
        if 'am' in str_text:
            new_str = str_text[0:str_text.index('am') - 1]
            n = new_str.rindex(' ')
            timing.append(new_str[n + 1:] + ' AM')
        elif 'pm' in str_text:
            new_str = str_text[0:str_text.index('pm') - 1]
            n = new_str.rindex(' ')
            timing.append(new_str[n + 1:] + ' PM')

    # create dataframes of the pickup, destination and time of pickup
    df1 = pd.DataFrame(pickup, columns=['origing'])
    df2 = pd.DataFrame(dest, columns=['destination'])
    df3 = pd.DataFrame(timing, columns=['time of pickup'])
    # concatenate the three dataframes to get the final dataframe for the SMS text file
    df_table_final = pd.concat([df1, df2, df3], axis=1)
    return df_table_final

env.reset()  # reset environment to a new, random state
env.render()

action_size = env.action_space.n
print("Action size ", action_size)

state_size = env.observation_space.n
print("State size ", state_size)

q_table = np.zeros((state_size, action_size))
print(q_table)

Output:

Training the agent:

%%time
"""Training the agent"""

import random
from IPython.display import clear_output

# Hyperparameters
alpha = 0.1
gamma = 0.6
epsilon = 0.1

# For plotting metrics
all_epochs = []
all_penalties = []

for i in range(1, 100001):
    state = env.reset()

    epochs, penalties, reward = 0, 0, 0
    done = False

    while not done:
        if random.uniform(0, 1) < epsilon:
            action = env.action_space.sample()  # Explore action space
        else:
            action = np.argmax(q_table[state])  # Exploit learned values

        next_state, reward, done, info = env.step(action)

        old_value = q_table[state, action]
        next_max = np.max(q_table[next_state])

        new_value = (1 - alpha) * old_value + alpha * (reward + gamma * next_max)
        q_table[state, action] = new_value

        if reward == -10:
            penalties += 1

        state = next_state
        epochs += 1

    if i % 100 == 0:
        clear_output(wait=True)
        print(f"Episode: {i}")

print("Training finished.\n")

Output:
Episode: 100000
Training finished.
CPU times: user 1min 10s, sys: 13.3 s, total: 1min 24s
Wall time: 1min 12s

Distance between pick-up and drop location:

text_file_path = 'drive/My Drive/project_2_dataset/sms.txt'
df_original = fetch_pickup_drop(text_file_path)
print(df_original)

# Create a local dictionary of cities
city = pd.read_csv('drive/My Drive/project_2_dataset/city.csv')
city['mapping'] = city['mapping'].map({0: 0., 1: 1., 2: 2., 3: 3.})
loc_dict = {city.iloc[0, 0]: city.iloc[0, 1],
            city.iloc[1, 0]: city.iloc[1, 1],
            city.iloc[2, 0]: city.iloc[2, 1],
            city.iloc[3, 0]: city.iloc[3, 1]}

# Replace each location by the numeric value of the city in the df_original dataframe
df_original['origing'] = df_original['origing'].map(loc_dict)
df_original['destination'] = df_original['destination'].map(loc_dict)

Output:
              origing       destination time of pickup
0             airport        hauz khaas           3 PM
1             airport        hauz khaas           6 PM
2          hauz khaas  dwarka sector 23           1 PM
3             airport        hauz khaas           1 AM
4             airport  dwarka sector 21          10 PM
..                ...               ...            ...
995           airport  dwarka sector 23           2 AM
996  dwarka sector 21  dwarka sector 23           2 PM
997        hauz khaas  dwarka sector 21           5 AM
998           airport  dwarka sector 23           6 PM
999           airport        hauz khaas           1 AM

[1000 rows x 3 columns]

Check pick-up and drop correctness:

def check_pick_up_drop_correction(pick_up, drop, line_num):
    original_origin = int(df_original.iloc[line_num, 0])
    original_destination = int(df_original.iloc[line_num, 1])
    if original_origin == pick_up and original_destination == drop:
        return True
    else:
        return False

Evaluate the agent's performance after Q-learning:

"""Evaluate the agent's performance after Q-learning"""

total_epochs, total_penalties, wrong_predictions, total_reward = 0, 0, 0, 0
episodes = 1000

for i in range(episodes):
    epochs, penalties, reward = 0, 0, 0
    # Generate a random state from the environment and store the fetched
    # pick-up and drop locations
    state = env.reset()
    q_table[state][4] = df_original.iloc[i, 0]
    q_table[state][5] = df_original.iloc[i, 1]

    done = False

    while not done:
        action = np.argmax(q_table[state, :])
        state, reward, done, info = env.step(action)
        epochs += 1

    checking = check_pick_up_drop_correction(int(q_table[state][4]),
                                             int(q_table[state][5]), i)
    if checking == False:
        wrong_predictions += 1
        reward = -10
        penalties += 1
    else:
        reward = 20

    total_penalties += penalties
    total_epochs += epochs
    total_reward += reward

print(f"Results after {episodes} episodes:")
print(f"Average timesteps per episode: {total_epochs / episodes}")
print(f"Average penalties per episode: {total_penalties / episodes}")
print(f"Total number of wrong predictions", wrong_predictions)
print("Total Reward is", total_reward)

Output:
Results after 1000 episodes:
Average timesteps per episode: 196.365
Average penalties per episode: 0.019
Total number of wrong predictions 19
Total Reward is 19430

Contact us to get instant help with Reinforcement Learning projects at: contact@codersarts.com
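As an optional follow-up to the evaluation above, here is a minimal sketch of replaying a single episode greedily with the learned q_table and rendering each step, to visually check the learned policy. It assumes the env and q_table objects defined above and the classic gym step API that returns four values; the 50-step cap is an arbitrary safeguard added for illustration.

import numpy as np

state = env.reset()
done = False
total_reward, steps = 0, 0

while not done and steps < 50:          # cap the episode length as a safeguard
    action = np.argmax(q_table[state])  # always pick the greedy action
    state, reward, done, info = env.step(action)
    total_reward += reward
    steps += 1
    env.render()                        # print the taxi grid after each move

print("Episode finished in", steps, "steps with total reward", total_reward)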

  • Food Ordering App

INTRODUCTION

Food ordering apps are all the rage, from restaurant owners building their own apps so you can order food while sitting on your couch, to services like Zomato, Foodpanda and Swiggy which act as collaboration platforms between restaurants and clients. Building a mobile app is easy with the advent of so many tools and technologies, but depending on the kind of app you are building, you may need to follow certain guidelines and adhere to some criteria. Food ordering app development is not a linear process, but one that entails a lot of complexity.

Research Findings

UX (user experience) research is the systematic investigation of users and their requirements, in order to add context and insight to the process of designing the user experience. UX research employs a variety of techniques, tools and methodologies to reach conclusions, determine facts and uncover problems, thereby revealing valuable information that can be fed into the design process.

User Personas

A user persona is a representation of the goals and behaviour of a hypothesised group of users. In most cases, personas are synthesised from data collected in interviews with users.

Empathy Mapping

An empathy map helps you understand your users' needs while you develop a deeper understanding of the people you are designing for. It is just one tool that can help you empathise with users, synthesise your observations from the research phase, and draw out unexpected insights about your users' needs.

Scenario and Storyboard

A scenario is a situation that captures how users perform tasks on your site or app. A storyboard is a visual representation of how the user would interact with your site or app.

Wireframing

A wireframe is a low-fidelity, simplified outline of your product. Wireframes are used early in the development process to establish the basic structure of an app before visual design and content are added. In the ideation phase I created wireframes presenting the information architecture of the future layout.

App User Interface Design

A mobile user interface (mobile UI) is the graphical, and usually touch-sensitive, display on a mobile device, such as a smartphone or tablet, that allows the user to interact with the device's apps, features, content and functions.

Development

After the successful completion of the design part, development was required to bring the project to life. Our Android and iOS developers stepped in and provided their best advice on the design and workflow, which made the app as flawless as it is today. The client was full of praise for the design of the project and is satisfied with our services in totality.

Problem Definition

The mobile aggregator is an application that combines various thematic platforms in order to increase their level of sales and make it convenient for users to choose dishes and drinks. A distinctive feature of the application is a single design and a user-friendly interface. I decided to create a competing app where making an order is as simple as a few clicks on a mobile device, and which is easy to understand and informative about the options and choices the users have.

Process

I started the process with competitive research and identified the top three competitors. Analysing and comparing the content of their apps helped me determine the direction of development. Further, to build empathy with users, I started off with a set of casual interviews.
This resulted in a preliminary set of requirements and in the creation of user personas. The interviews helped me discover the main requests of the users:
• Quality of the service
• A good choice of listed restaurants
• Delivery/take-away option
• Price criterion
• Reviews from other users

Wireframes

How does our food delivery system work for a restaurant?

Step 1: Using the customer mobile app, your customers browse your food menu on their smartphone. They select items and quantities to make a food order and pay for it via the mobile app. Upon payment confirmation, they are redirected to the order confirmation information and can then track the order from their account. The tracking tree of the order shows all the stages: order received by the restaurant, order in preparation, order ready for delivery, order picked up by the driver, and confirmation of delivery.

Step 2: After order confirmation, the order goes into the order management tablet app, which is generally placed at the cash counter of the restaurant or in the kitchen. Staff can accept or deny the order with a note. The order management app has two options: automatic printing of new orders via a wireless thermal printer, or manual printing via the wireless thermal printer. As soon as the staff accepts the order, it also goes automatically to the nearest available delivery boy who has signed up with the system.

Step 3: While the food is being prepared, restaurant staff can manually assign the delivery to the nearest available delivery boy, or the delivery boy sees the order in his delivery app and can assign himself to deliver it.

Step 4: The delivery boy reaches the restaurant and picks up the order for delivery to the customer's given address. Customers can see the live movements of the delivery boy, with an estimated time of arrival, on the map interface.

Step 5: The order is delivered to the customer's address and the driver marks the delivery as complete in his delivery mobile app.

CONCLUSION

Taking into account all the details mentioned, we can conclude that food ordering app development requires a professional workforce, time and resources. Careful planning and learning your users' needs clarifies a lot of important cornerstones. The point is that you need to target not only the clients but the restaurants and couriers as well. Covering all their needs is the proven business strategy behind making a food ordering app that can become a successful market competitor. An online food ordering system has been a great way to build brands and strengthen businesses. Thus, it is no exaggeration to conclude that food ordering and delivery has come a long way since its outset and keeps growing, with new features added every passing day.

Hire Figma experts for any kind of project – urgent bug fixes, minor enhancements, full-time and part-time projects. If you need help with any type of project, our experts will help you start designing immediately. Contact us

T H A N K Y O U ! !

  • Disease Detection in Plants.

We will implement this program with Keras. Keras is a deep learning API written in Python, running on top of the machine learning platform TensorFlow. It was developed with a focus on enabling fast experimentation. Keras follows best practices for reducing cognitive load: it offers consistent and simple APIs, it minimises the number of user actions required for common use cases, and it provides clear and actionable error messages. It also has extensive documentation and developer guides.

First we will import all the important libraries, along with the CIFAR-10 data set from tensorflow.keras.datasets. The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class; there are 50,000 training images and 10,000 test images.

from tensorflow.keras.datasets import cifar10
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, InputLayer
from keras.models import Sequential
from keras import optimizers
from keras.layers import Dense, Conv2D
import tensorflow as tf
from keras.applications import vgg16
from keras.models import Model
import keras
from keras.applications.imagenet_utils import preprocess_input
from keras.preprocessing import image
import os
import cv2
import matplotlib.pyplot as plt
import numpy as np
import random

It can take a lot of time (hours to days) to create neural network models and train them using traditional methods. But if one has a pre-constructed network structure and pre-trained weights, it may take just a few seconds to do the same. This way, learning outcomes are transferred between different parties. Transfer learning generally refers to a process where a model trained on one problem is used in some way on another, related problem. Furthermore, you do not need a large-scale training data set once the learning outcomes are transferred.

Inception V3 is a type of CNN (Convolutional Neural Network) which consists of many convolution and max pooling layers, and it also contains fully connected layers. We don't need to know its structure by heart to work with it; all of that is handled by Keras. We import Inception V3 and then construct a model using it as follows:

from keras.applications.inception_v3 import InceptionV3
from keras.applications.inception_v3 import preprocess_input
from keras.applications.inception_v3 import decode_predictions
from keras.preprocessing import image
import numpy as np
import matplotlib.pyplot as plt

model = InceptionV3(weights='imagenet', include_top=True)
print("model structure: ", model.summary())

Now we have a pre-constructed network structure and pre-trained weights for an ImageNet-winning model, and we can ask Inception V3 to classify anything. Next, we will define a function named classification_v3 which will collect the training data and print the 3 most probable candidates for each image in the categories Apple___Apple_scab, Apple___Cedar_apple_rust and Apple_Frogeye_Spot, displaying each image together with its predictions.
datadir = loc
# specify sub folder names in the list below
catagories = ['Apple___Apple_scab', 'Apple___Cedar_apple_rust', 'Apple_Frogeye_Spot']
img_size = 299
training_data = []

def classification_v3():
    for category in catagories:
        path = os.path.join(datadir, category)
        classnum = catagories.index(category)
        for img in os.listdir(path):
            img_arr = cv2.imread(os.path.join(path, img))
            new_arr = cv2.resize(img_arr, (img_size, img_size))
            x = np.expand_dims(new_arr, axis=0)
            x = preprocess_input(x)
            features = model.predict(x)
            print(decode_predictions(features, top=3))
            plt.imshow(image.load_img(os.path.join(path, img)))
            plt.show()
            training_data.append([features, classnum])

classification_v3()
print(len(training_data))
random.shuffle(training_data)

Now that we have constructed a model and collected the training data along with its features and labels, we will train the model. We reshape and scale the training data so it can be fed to the model, and define a function get_features that returns the features of the preprocessed training data when called. The main concept is the stacking of convolutional layers to create deep neural networks. We use a pre-trained model as the feature-extraction base: the code below loads InceptionV3, with VGG16 (Visual Geometry Group 16) shown as a commented-out alternative.

import tensorflow as tf

x = []
y = []
for feature, label in training_data:
    x.append(feature)
    y.append(label)

X = np.array(x).reshape(-1, img_size, img_size, 3)  # 1 for grayscale, 3 for bgr/rgb
X.shape

train_imgs_scaled = X.astype('float32')
train_imgs_scaled /= 255

batch_size = 30
num_classes = 5
epochs = 30
input_shape = (150, 150, 3)

vgg = tf.keras.applications.InceptionV3(
    include_top=True,
    weights='imagenet',
    input_tensor=None,
    input_shape=input_shape,
    pooling=None,
    classes=1000,
    classifier_activation='softmax'
)
'''
vgg = vgg16.VGG16(include_top=False, weights='imagenet', input_shape=input_shape)
'''

output = vgg.layers[-1].output
output = keras.layers.Flatten()(output)
vgg_model = Model(vgg.input, output)

vgg_model.trainable = False
for layer in vgg_model.layers:
    layer.trainable = False

def get_features(model, input_imgs):
    features = model.predict(input_imgs, verbose=0)
    return features

train_features_vgg = get_features(vgg_model, train_imgs_scaled)

input_shape = vgg_model.output_shape[1]

model = Sequential()
model.add(InputLayer(input_shape=(input_shape,)))
model.add(Dense(512, activation='relu', input_dim=input_shape))
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(3, activation='softmax'))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['accuracy'])

history = model.fit(x=train_features_vgg, y=y,
                    validation_split=0.3,
                    batch_size=batch_size,
                    epochs=1,
                    verbose=1)

The Sequential() function groups a linear stack of layers into a tf.keras.Model. We define an instance of Sequential named 'model' and use it to add three hidden layers to the neural network. The first two layers use 'relu' as their activation function and the last one uses 'softmax'. We then compile the model with model.compile and train it on our training data. We will now check the accuracy and loss of the model and plot them.
f, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
t = f.suptitle('Performance', fontsize=12)
f.subplots_adjust(top=0.85, wspace=0.3)

epoch_list = list(range(1, 31))
ax1.plot(epoch_list, history.history['accuracy'], label='Train Accuracy')
ax1.plot(epoch_list, history.history['val_accuracy'], label='Validation Accuracy')
ax1.set_xticks(np.arange(0, 31, 5))
ax1.set_ylabel('Accuracy Value')
ax1.set_xlabel('Epoch')
ax1.set_title('Accuracy')
l1 = ax1.legend(loc="best")

ax2.plot(epoch_list, history.history['loss'], label='Train Loss')
ax2.plot(epoch_list, history.history['val_loss'], label='Validation Loss')
ax2.set_xticks(np.arange(0, 31, 5))
ax2.set_ylabel('Loss Value')
ax2.set_xlabel('Epoch')
ax2.set_title('Loss')
l2 = ax2.legend(loc="best")

We can see that our model is almost accurate on the training data.

GitHub link: https://github.com/CodersArts2017/Jupyter-Notebooks/blob/master/plant_village_data_prediction_inception.ipynb
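Building on the notebook above, here is a minimal sketch of how the trained classifier could be applied to a single new image. It assumes the vgg_model feature extractor, the classifier model and the catagories list defined above; the image path 'sample_leaf.jpg' is hypothetical and used only for illustration.

import cv2
import numpy as np

def predict_disease(img_path):
    # resize the image to the input size expected by the frozen base model
    h, w = vgg_model.input_shape[1], vgg_model.input_shape[2]
    img_arr = cv2.imread(img_path)
    img_arr = cv2.resize(img_arr, (w, h))
    x = np.expand_dims(img_arr, axis=0).astype('float32') / 255
    # extract features with the frozen base model, then classify them
    features = vgg_model.predict(x, verbose=0)
    probs = model.predict(features, verbose=0)[0]
    return catagories[int(np.argmax(probs))]

# hypothetical path, for illustration only
print(predict_disease('sample_leaf.jpg'))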

  • Research Paper Implementation : Recent Advances in Convolutional Neural Networks

ABSTRACT

In the last few years, deep learning has led to very good performance on a variety of problems, such as visual recognition, speech recognition and natural language processing. Among the different types of deep neural networks, convolutional neural networks have been the most extensively studied. Leveraging the rapid growth in the amount of annotated data and the great improvements in the strength of graphics processing units, research on convolutional neural networks has emerged swiftly and achieved state-of-the-art results on various tasks. In this paper, we provide a broad survey of the recent advances in convolutional neural networks. We detail the improvements of CNNs in different aspects, including layer design, activation function, loss function, regularization, optimization and fast computation. Besides this, we also introduce various applications of convolutional neural networks in computer vision, speech and natural language processing.

Keywords: Convolutional Neural Network, Deep learning.

To download the full research paper, click on the link below. If you need an implementation of this research paper or any of its variants, feel free to contact us at contact@codersarts.com.

  • Research Paper Implementation : Research Paper Recommender System Evaluation Using Coverage

ABSTRACT

Recommendation systems (RS) support users and developers of various computer and software systems in overcoming information overload, performing information discovery tasks and approximating computation, among others. Recommender systems research is frequently based on comparisons of predictive accuracy: the better the evaluation scores, the better the recommender. However, it is difficult to compare results from different recommender systems due to the many options in the design and implementation of an evaluation strategy. Additionally, algorithmic implementations can deviate from the standard formulation due to manual tuning and modifications that work better in some situations. We have compared common recommendation algorithms as implemented in three popular recommendation frameworks. When evaluating the quality of recommender systems, most approaches focus only on predictive accuracy; recent works suggest that, beyond accuracy, there is a variety of other metrics that should be considered when evaluating an RS. This paper reviews a range of evaluation metrics and measures, as well as some approaches used for evaluating recommendation systems. The analysis shows large differences in recommendation accuracy across frameworks and strategies. We are developing a recommender system for research papers using coverage.

Key Words: Recommender System, Research Paper Recommender System, Evaluation, Metrics, Coverage.

To download the full research paper, click on the link below. If you need an implementation of this research paper or any of its variants, feel free to contact us at contact@codersarts.com.

  • Traffic Management System Using Java

Abstract

The goal of this assignment is to implement a set of classes and interfaces to be used to create a simulation of a traffic management system. You will implement precisely the public and protected items described in the supplied documentation (no extra public/protected members or classes). Private members may be added at your own discretion.

Language requirements: Java version 13, JUnit 4

Introduction

In this assignment you will finish building a simple simulation of a traffic management system (TMS). A traffic management system monitors traffic flow in a region and adjusts traffic signals to optimise traffic flow. A TMS uses different types of sensors and signals to monitor and manage traffic flow. In the first assignment you implemented the core model for the TMS. In the second assignment you will implement some of the more advanced logic to provide a very simple simulation for the TMS.

In addition to the pressure pads and speed cameras from assignment one, you will add a vehicle count sensor. It counts vehicles passing a location and reports the traffic flow as the number of vehicles in a time period. You need to integrate this new type of sensor into the system. This is an example of a common situation when building a large system: new features need to be added. A well-designed system that uses interfaces to define an API means it should be simple to add the new feature.

In assignment one, you implemented traffic lights and electronic speed signs and attached them to a route. In assignment two you will provide logic to coordinate traffic lights at intersections. The TMS monitors sensors along routes and manages signals on routes, and at intersections, to optimise traffic flow.

In assignment one, the network of routes was implicitly defined by your test code and SimpleDisplay. In assignment two you will implement the logic for the TMS to maintain a network of routes. This includes the ability to load a network from a data file and save a modified network to a file.

Monitoring and managing congestion requires sophisticated logic in a real TMS. In assignment one, congestion was simply reported by each sensor. In assignment two you will implement logic for congestion calculators. These take the congestion data from a set of sensors and determine the overall congestion for the route(s) covered by the sensors. The approach taken is to define a CongestionCalculator interface that provides an API. Different classes can implement this interface to provide different options for the logic of determining congestion. This is another example of a common approach to designing flexibility into the system's structure.

When implementing the assignment you need to remember that it is a simulation of the TMS and not the real TMS. Interfaces are provided for the sensors to allow easy replacement of sensor implementations in the program. You will not be collecting data from real sensors but will be implementing classes that demonstrate the behaviour of sensors: they store a set of data values that are used to simulate the sensors returning different values over time. Signals are simple simulations of real signals, in that they only store the current state of the signal and allow the route to update the signal. To manage the simulation of time, there is a TimedItem interface and a TimedItemManager class, which you implemented in assignment one. Sensors implement the TimedItem interface, as they are items which need to react to timed events.
TimedItemManager stores all the TimedItem objects in the application. The simulation's GUI tracks time passing in MainView.run() and invokes MainViewModel.tick() once per second. The tick method calls the TimedItemManager's oneSecond method, which sends the oneSecond message to all TimedItems. This approach of tracking the passage of time and invoking an action on all relevant objects once per second is the reason that TimedItemManager is implemented as a singleton.

A simple GUI has been provided to you as part of the provided code. It is in the tms.display package. It will not work until you have implemented the other parts of the assignment that it uses. The GUI has been implemented using JavaFX and consists of three classes and an enum. MainView creates the main window for the TMS GUI. StructureView displays the structure of the traffic network. MainViewModel represents the TMS model that is to be displayed. The TMS application is initialised and started by the Launcher class in the tms package, which loads the traffic network data and creates the GUI.

Most of the GUI code has been provided to you. In MainViewModel you need to implement some of the logic that is executed by events in the simulation and handle keyboard input for the main application's window. The functionality you need to implement in MainViewModel is to:
Save the state of the network to a file in response to the user selecting the save command. This is to be implemented in MainViewModel.save().
Allow the simulation's execution to be paused and unpaused. This is to be implemented in MainViewModel.togglePaused().
Process time passing in the simulation. This is to be implemented in MainViewModel.tick().

Keyboard input is handled by the accept method in the MainViewModel class. It needs to process input from the user in the main window to perform actions in the simulation. Pressing the 'P' key will toggle whether the simulation is paused or not. The 'Q' key will quit the simulation. The 'S' key will save the current network to a file called "DefaultSave.txt". A shell for this method has been provided because it is already hooked into the GUI.

Persistent Data

You need to implement loading a network from a data file. The JavaDoc for the loadNetwork method in the NetworkInitialiser class describes the format of a network data file. Saving a network is done by the save method in the MainViewModel class. A network data file is structured as follows:
The first line is the number of intersections (ni) in the file.
The second line is the number of routes in the file.
The third line is the duration of a yellow light.
The following ni lines are the intersection details. The first part of an intersection line is its id. This is optionally followed by a ':', a duration, another ':', and a sequence of intersection ids separated by commas.
The final set of lines are the route details, including any sensors on the routes.
– Each route is on a separate line. The sensors for a route are on the lines immediately after the line for the route.
– A route is described by the id of the from intersection, followed by a ':', then the id of the to intersection, followed by a ':', then the default speed for the route, followed by a ':', then the number of sensors on the route, then optionally a ':' and the speed of the electronic speed sign on the route if it has one.
– If the route has any sensors, each sensor follows on a separate line.
– The first part of a sensor line is its type: 'PP', 'SC' or 'VC'. This is followed by a ':', then its threshold value, a ':', and then a comma-separated list of the data values used to simulate the data returned by the sensor.

Any line that starts with a semi-colon ';' is a comment and is to be ignored when reading the data from the file. Attempting to read an invalid network data file should throw an InvalidNetworkException. An example data file, called demo.txt, is provided in your repository in the networks directory. It corresponds to the diagram below.

Supplied Material
This task sheet.
An example network data file.
Code specification document (Javadoc).
A Subversion repository for submitting your assignment, called ass2.
A simple graphical user interface for the simulation, which is in the display package.
A sample solution for the first assignment. You are to use this as the base for your implementation of the second assignment. As the first step in the assignment you should create a new project by checking out the ass2 repository from Subversion.

Javadoc

Code specifications are an important tool for developing code in collaboration with other people. Although assignments in this course are individual, they still aim to prepare you for writing code to a strict specification by providing a specification document (in Java, this is called Javadoc). You will need to implement the specification precisely as it is described in the specification document. The Javadoc can be viewed in either of the two following ways:
1. Open https://csse2002.uqcloud.net/assignment/2/ in your web browser. Note that this will only be the most recent version of the Javadoc.
2. Navigate to the relevant assignments folder under Assessment on Blackboard and you will be able to download the Javadoc .zip file containing html documentation. Unzip the bundle somewhere, and open docs/index.html with your web browser.

Tags in the Javadoc indicate what code has been implemented in assignment one and what code you need to implement in assignment two. Some code from assignment one will need to be modified. There are tags indicating places where you can expect to modify the assignment one code, but these are not guaranteed to be all of the places where you may end up modifying code from assignment one.

Tasks
1. Implement the classes and methods described in the Javadoc as being required for assignment two.
2. Implement the indicated features of the user interface.
3. Write JUnit 4 tests for all the methods in the following classes: AveragingCongestionCalculator (in a class called AveragingCongestionCalculatorTest), IntersectionLights (in a class called IntersectionLightsTest), NetworkInitialiser (in a class called NetworkInitialiserTest).

Submission

Submission is via your Subversion repository. You must ensure that you have committed your code to your repository before the submission deadline. Code that is submitted after the deadline will not be marked. Failure to submit your code through your repository will result in it not being marked. Details for how to submit your assignment are available in the Version Control Guide.
Your repository url is:
https://source.eait.uq.edu.au/svn/csse2002-s???????/trunk/ass2 — CSSE2002 students, or
https://source.eait.uq.edu.au/svn/csse7023-s???????/trunk/ass2 — CSSE7023 students

Your submission should have the following internal structure:
src/ folders (packages) and .java files for classes described in the Javadoc
test/ folders (packages) and .java files for the JUnit test classes

A complete submission would look like:
src/tms/congestion/AveragingCongestionCalculator.java
src/tms/congestion/CongestionCalculator.java
src/tms/display/ButtonOptions.java
src/tms/display/MainView.java
src/tms/display/MainViewModel.java
src/tms/display/StructureView.java
src/tms/intersection/Intersection.java
src/tms/intersection/IntersectionLights.java
src/tms/network/Network.java
src/tms/network/NetworkInitialiser.java
src/tms/route/Route.java
src/tms/route/SpeedSign.java
src/tms/route/TrafficLight.java
src/tms/route/TrafficSignal.java
src/tms/sensors/DemoPressurePad.java
src/tms/sensors/DemoSensor.java
src/tms/sensors/DemoSpeedCamera.java
src/tms/sensors/DemoVehicleCount.java
src/tms/sensors/PressurePad.java
src/tms/sensors/Sensor.java
src/tms/sensors/SpeedCamera.java
src/tms/sensors/VehicleCount.java
src/tms/util/DuplicateSensorException.java
src/tms/util/IntersectionNotFoundException.java
src/tms/util/InvalidNetworkException.java
src/tms/util/InvalidOrderException.java
src/tms/util/RouteNotFoundException.java
src/tms/util/TimedItem.java
src/tms/util/TimedItemManager.java
src/tms/Launcher.java
test/tms/congestion/AveragingCongestionCalculatorTest.java
test/tms/intersection/IntersectionLightsTest.java
test/tms/network/NetworkInitialiserTest.java
test/tms/JdkTest.java

Ensure that your assignments correctly declare the package they are within. For example, CongestionCalculator.java should declare package tms.congestion. Do not submit any other files (e.g. no .class files). Note that AveragingCongestionCalculatorTest, IntersectionLightsTest and NetworkInitialiserTest will be compiled without the rest of your files.

If you are looking for a solution to this project assignment, you can contact us using the contact details below; we also provide help with other Java-related technologies such as JavaFX, Spring, J2EE, etc. contact@codersarts.com

  • Research Paper Implementation : New Thinking on, and with, Data Visualization.

    ABSTRACT As the complexity and volume of datasets have increased along with the capabilities of modular, open-source, easy-to-implement, visualization tools, scientists’ need for, and appreciation of, data visualization has risen too. Until recently, scientists thought of the “explanatory” graphics created at a research project’s conclusion as “pretty pictures” needed only for journal publication or public outreach. The plots and displays produced during a research project--often intended only for experts--were thought of as a separate category, what we here call “exploratory” visualisation. In this view, discovery comes from exploratory visualisation, and explanatory visualisation is just for communication. Our aim in this paper is to spark conversation amongst scientists, computer scientists, outreach professionals, educators, and graphics and perception experts about how to foster flexible data visualisation practices that can facilitate discovery and communication at the same time. We present an example of a new finding made using the glue visualisation environment to demonstrate how the border between explanatory and exploratory visualisation is easily traversed. The linked-view principles as well as the actual code in glue are easily adapted to astronomy, medicine, and geographical information science--all fields where combining, visualising, and analysing several high-dimensional datasets yields insight. Whether or not scientists can use such a flexible “undisciplined” environment to its fullest potential without special training remains to be seen. We conclude with suggestions for improving the training of scientists in visualization practices, and of computer scientists in the iterative, non-workflow-like, ways in which modern science is carried out. To download full research paper click on the link below. If you need implementation of this research paper or any of its variants, feel free contact us on contact@codersarts.com.

  • Linear Regression: Boston Housing data set

    We will work with the Boston housing data set, which contains information about houses in Boston. It is provided in the scikit-learn library. There are 506 samples and 13 feature variables in this dataset. The objective is to predict house prices based on the number of rooms, and for this we will implement linear regression.

    First we will load the Boston data set from sklearn.datasets and then convert it into a DataFrame using pandas so that we can work with it easily. We use NumPy to work with arrays, and matplotlib and seaborn to visualise the data.

    import numpy as np
    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_boston
    %matplotlib inline
    from sklearn import datasets
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.metrics import mean_squared_error

    boston = load_boston()
    boston.keys()

    Output: dict_keys(['data', 'target', 'feature_names', 'DESCR', 'filename'])

    data: contains the information for various houses
    target: prices of the house
    feature_names: names of the features
    DESCR: describes the data set

    bos = pd.DataFrame(boston.data, columns = boston.feature_names)
    bos['PRICE'] = boston.target
    bos.head()

    Now, we will explore the data to understand it better. We will look at the descriptive statistics of the data using the describe() function.

    bos.describe()

    We will plot a histogram of the PRICE feature.

    sns.set(rc={'figure.figsize':(11.7,8.27)})
    plt.hist(bos['PRICE'], color='red', bins=30)
    plt.xlabel("House prices in $1000")
    plt.show()

    We observe that the data is distributed roughly normally and that there are only a few outliers. Next, we create a correlation matrix that measures the linear relationships between the variables. It can be computed with the corr() function, and we will use the heatmap function from the seaborn library to plot it.

    bos_1 = pd.DataFrame(boston.data, columns = boston.feature_names)
    bos_1['PRICE'] = boston.target
    correlation_matrix = bos_1.corr().round(2)
    sns.heatmap(data=correlation_matrix, annot=True)

    The correlation coefficient ranges from -1 to 1. If the value is close to 1, there is a strong positive correlation between the two variables; if it is close to -1, they have a strong negative correlation. We prefer to fit the model with features that have a high correlation (whether positive or negative) with our target feature PRICE. The feature RM has a strong positive correlation with PRICE (0.7), whereas LSTAT has a high negative correlation (-0.74). We will draw a scatter plot of RM and LSTAT against PRICE to better visualise the correlation.

    plt.figure(figsize=(20, 5))
    features = ['LSTAT', 'RM']
    target = bos['PRICE']
    for i, col in enumerate(features):
        plt.subplot(1, len(features), i+1)
        x = bos[col]
        y = target
        plt.scatter(x, y, color='green', marker='o')
        plt.title("Variation in House prices")
        plt.xlabel(col)
        plt.ylabel('House prices in $1000')

    Another important point when selecting features for a linear regression model is to check for multicollinearity. The features RAD and TAX have a correlation of 0.91, so they are strongly correlated with each other and should not both be selected for training the model. The same goes for the features DIS and AGE, which have a correlation of -0.75.
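    As a quick aside (not part of the original walkthrough), strongly correlated feature pairs can be read straight off the correlation_matrix variable computed above; the minimal sketch below does just that, and the 0.75 cutoff is an illustrative assumption of ours.

    # Rough multicollinearity check, reusing correlation_matrix from above.
    # The 0.75 threshold is an illustrative assumption, not a value from the original post.
    threshold = 0.75
    cols = correlation_matrix.columns
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = correlation_matrix.iloc[i, j]
            # Only report feature-feature pairs, not correlations with the target.
            if abs(r) >= threshold and 'PRICE' not in (cols[i], cols[j]):
                print('{} and {} are strongly correlated (r = {})'.format(cols[i], cols[j], r))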
    Since we want to predict prices based on the number of rooms only, we will use only the RM feature to train the model, and PRICE will be our target.

    X_rooms = bos.RM
    y_price = bos.PRICE
    X_rooms = np.array(X_rooms).reshape(-1,1)
    y_price = np.array(y_price).reshape(-1,1)
    X_train_1, X_test_1, Y_train_1, Y_test_1 = train_test_split(X_rooms, y_price, test_size=0.2, random_state=5)

    We have split the data set into a training set and a test set so that we can check how well our model performs on unseen data. We fit a LinearRegression model to our training data and train it. We also calculate the RMSE (root mean squared error) and the R2 score to see how well the model performs.

    reg_1 = LinearRegression()
    reg_1.fit(X_train_1, Y_train_1)
    y_train_predict_1 = reg_1.predict(X_train_1)
    rmse = np.sqrt(mean_squared_error(Y_train_1, y_train_predict_1))
    r2 = round(reg_1.score(X_train_1, Y_train_1), 2)
    print('RMSE is {}'.format(rmse))
    print('R2 score is {}'.format(r2))
    print("\n")

    We get:
    RMSE is 6.972277149440585
    R2 score is 0.43

    Our model has been trained. Now, we will evaluate it on the test data.

    y_test_predict_1 = reg_1.predict(X_test_1)
    rmse = np.sqrt(mean_squared_error(Y_test_1, y_test_predict_1))
    r2 = round(reg_1.score(X_test_1, Y_test_1), 2)
    print('Root Mean Squared Error: {}'.format(rmse))
    print('R^2: {}'.format(r2))

    We get:
    Root Mean Squared Error: 4.895963186952216
    R^2: 0.69

    We can see that our model performed better on the test set. We will now plot our predictions:

    prediction_space = np.linspace(min(X_rooms), max(X_rooms)).reshape(-1,1)
    plt.scatter(X_rooms, y_price)
    plt.plot(prediction_space, reg_1.predict(prediction_space), color='black', linewidth=3)
    plt.ylabel('value of house/1000($)')
    plt.xlabel('number of rooms')
    plt.show()

    GitHub Link: https://github.com/CodersArts2017/Jupyter-Notebooks/blob/master/bosten_data_analysis.ipynb
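    As a brief follow-up to the walkthrough above (not part of the original post), cross_val_score, which is already imported, could be used to get an estimate of performance that depends less on a single train/test split. This is a minimal sketch; the choice of 5 folds is an assumption, and the folds are not shuffled by default, so the scores will vary from the single-split results above.

    # Hedged sketch: 5-fold cross-validated R^2 for the rooms-only model.
    # cv=5 is an assumption; the default scorer for a regressor is R^2.
    cv_scores = cross_val_score(LinearRegression(), X_rooms, y_price.ravel(), cv=5)
    print('Cross-validated R2 scores: {}'.format(cv_scores))
    print('Mean R2: {}'.format(cv_scores.mean()))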

  • Unsupervised Machine Learning: Classification of Iris data set

    The Iris data set is the 'Hello world' of the field of data science. This data set consists of the petal and sepal measurements of 3 different types of irises (Setosa, Versicolour, and Virginica), stored in a 150x4 numpy.ndarray. The rows are the samples and the columns are: Sepal Length, Sepal Width, Petal Length and Petal Width. The data set is often used in data mining, classification and clustering examples and to test algorithms.

    First we will load the Iris data set from sklearn.datasets and then convert it into a DataFrame using pandas so that we can work with it easily. We will use matplotlib and seaborn to visualise the data.

    from sklearn.datasets import load_iris
    import matplotlib.pyplot as plt
    import pandas as pd
    import numpy as np
    import seaborn as sns

    dataset = load_iris()
    data = pd.DataFrame(dataset['data'], columns=['Sepal Length','Sepal Width','Petal Length','Petal Width'])
    data['Species'] = dataset['target']
    data['Species'] = data['Species'].apply(lambda x: dataset['target_names'][x])
    data.head()

    Now, we will explore the data to understand it better and to make it suitable to be fed to the machine learning algorithms. We will check for missing values in the data:

    data.isnull().sum()

    We see that there are no missing values in this data set. We will check the information about our DataFrame, including the index dtype and columns, non-null values and memory usage, by calling the info() function.

    data.info()

    We can see that there are 150 rows in total and no missing values; the type of each column is also shown. Now, we will look at the descriptive statistics of the data using the describe() function.

    data.describe()

    We will now make a pairplot using seaborn to visualise the relationship between the columns of the data frame for the different species.

    sns.pairplot(data, hue='Species')

    From the graph we can easily see the range of petal length, petal width, sepal length and sepal width for the 3 species. Thus, we can determine to which species a new data point might belong. We can also visualise our DataFrame using a heatmap, a violin plot and a boxplot, as follows:

    Heatmap:

    plt.figure(figsize=(10,11))
    sns.heatmap(data.drop(columns=['Species']).corr(), annot=True)
    plt.plot()

    Violin plot:

    plt.figure(figsize=(12,10))
    plt.subplot(2,2,1)
    sns.violinplot(x='Species', y='Sepal Length', data=data)
    plt.subplot(2,2,2)
    sns.violinplot(x='Species', y='Sepal Width', data=data)
    plt.subplot(2,2,3)
    sns.violinplot(x='Species', y='Petal Length', data=data)
    plt.subplot(2,2,4)
    sns.violinplot(x='Species', y='Petal Width', data=data)

    Boxplot:

    plt.figure(figsize=(12,10))
    plt.subplot(2,2,1)
    sns.boxplot(x='Species', y='Sepal Length', data=data)
    plt.subplot(2,2,2)
    sns.boxplot(x='Species', y='Sepal Width', data=data)
    plt.subplot(2,2,3)
    sns.boxplot(x='Species', y='Petal Length', data=data)
    plt.subplot(2,2,4)
    sns.boxplot(x='Species', y='Petal Width', data=data)

    Now, we need to replace the values of the Species column with numerical values so that they are easier for the machine to work with. This process is called encoding. We will replace 'setosa' with 0, 'versicolor' with 1 and 'virginica' with 2.

    from sklearn import preprocessing

    le = preprocessing.LabelEncoder()
    le.fit(data['Species'])
    y = le.transform(data['Species'])
    y

    We can see that the Species column has been encoded using LabelEncoder from the preprocessing module of the sklearn library (a quick check of the mapping is sketched below). Now, our data is ready to be trained on different classification models.
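    As a small sanity check of the encoding (not part of the original post), the mapping learned by the LabelEncoder can be printed; the sketch below only reuses the le object defined above, and the dict/zip construction is illustrative.

    # Print the label-to-integer mapping learned by the LabelEncoder above.
    mapping = dict(zip(le.classes_, le.transform(le.classes_)))
    print(mapping)  # expected: {'setosa': 0, 'versicolor': 1, 'virginica': 2}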
    Here we will train our data on DecisionTreeClassifier, a support vector classifier (SVC), RandomForestClassifier and KNeighborsClassifier, and then determine which algorithm worked best.

    DecisionTreeClassifier

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    DT = DecisionTreeClassifier(random_state=0)
    score1 = cross_val_score(DT, data[['Sepal Length','Sepal Width','Petal Length']], y, cv=10)
    print(score1.mean())

    Output: 0.9466666666666667

    SupportVectorClassifier

    from sklearn.svm import SVC

    svm_clf = SVC(gamma='auto')
    score2 = cross_val_score(svm_clf, data[['Sepal Length','Sepal Width','Petal Length']], y, cv=10)
    print(score2.mean())

    Output: 0.9533333333333334

    RandomForestClassifier

    from sklearn.ensemble import RandomForestClassifier

    RFC = RandomForestClassifier(max_depth=2, random_state=0)
    score3 = cross_val_score(RFC, data[['Sepal Length','Sepal Width','Petal Length']], y, cv=10)
    print(score3.mean())

    Output: 0.9133333333333333

    KNeighborsClassifier

    from sklearn.neighbors import KNeighborsClassifier

    KNN = KNeighborsClassifier(n_neighbors=3)
    score4 = cross_val_score(KNN, data[['Sepal Length','Sepal Width','Petal Length']], y, cv=10)
    print(score4.mean())

    The output of each snippet is the model's mean cross-validated accuracy: the higher the accuracy, the better the model is at classifying correctly. Of the scores shown above, the support vector classifier did best, followed by the decision tree classifier and then the random forest classifier. We can't say that this ranking always holds: different algorithms work better in different situations, so we should always train our data on a number of algorithms and then select the best one. A compact way to run the same comparison in a single loop is sketched below.

    GitHub link: https://github.com/CodersArts2017/Jupyter-Notebooks/raw/master/IRIS_CONTENT.ipynb
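    Here is a minimal sketch of that single-loop comparison, reusing only the estimators, features and labels defined above; the dictionary of model names and the print format are illustrative assumptions rather than part of the original post.

    # Cross-validate each of the models defined above and report its mean accuracy.
    # DT, svm_clf, RFC, KNN, data and y come from the snippets earlier in this section.
    models = {'Decision tree': DT, 'Support vector classifier': svm_clf,
              'Random forest': RFC, 'K-nearest neighbours': KNN}
    features = data[['Sepal Length','Sepal Width','Petal Length']]
    for name, model in models.items():
        scores = cross_val_score(model, features, y, cv=10)
        print('{}: mean accuracy = {:.3f}'.format(name, scores.mean()))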

  • Research Paper Implementation: Machine Translation for Academic Purposes.

    ABSTRACT: Due to the globalization trend and knowledge boost in the second millennium, multi-lingual translation has become a noteworthy issue. For the purposes of learning knowledge in academic fields, Machine Translation (MT) should be noticed not only academically but also practically. MT should be introduced to translation learners because it is a valuable approach applied by professional translators in diverse professional fields. For learning translating skills and finding a way to learn and teach through bi-lingual/multilingual translating functions in software, machine translation is an ideal approach that translation trainers, translation learners, and professional translators should be familiar with. In fact, theories for machine translation and computer assistance have been highly valued by many scholars (e.g., Hutchines, 2003; Thriveni, 2002). Based on the translation of MIT’s Open Courseware into Chinese that Lee, Lin and Bonk (2007) have introduced, this paper demonstrates how MT can be efficiently applied as a superior way of teaching and learning. This article predicts that translated courses utilizing MT for residents of the global village should emerge and be provided soon in industrialized nations, and it exhibits an overview of the current developmental status of MT, why MT should be fully applied for academic purposes, such as translating a textbook or teaching and learning a course, and what types of software can be successfully applied. It implies that MT should be promoted in Taiwan because its functions of clearly translating the key-words and leading basic learners into a certain professional field are demonstrated by the MIT case.

    Keywords: Machine Translation, Computational Linguistics, Bi-lingual/Multilingual Translating, Open Courseware.

    To download the full research paper, click on the link below. If you need an implementation of this research paper or any of its variants, feel free to contact us at contact@codersarts.com.

  • Research Paper Implementation: Autoencoders, Unsupervised Learning, and Deep Architectures.

    ABSTRACT: Autoencoders play a fundamental role in unsupervised learning and in deep architectures for transfer learning and other tasks. In spite of their fundamental role, only linear autoencoders over the real numbers have been solved analytically. Here we present a general mathematical framework for the study of both linear and non-linear autoencoders. The framework allows one to derive an analytical treatment for the most non-linear autoencoder, the Boolean autoencoder. Learning in the Boolean autoencoder is equivalent to a clustering problem that can be solved in polynomial time when the number of clusters is small and becomes NP-complete when the number of clusters is large. The framework sheds light on the different kinds of autoencoders, their learning complexity, their horizontal and vertical composability in deep architectures, their critical points, and their fundamental connections to clustering, Hebbian learning, and information theory.

    KEYWORDS: autoencoders, unsupervised learning, compression, clustering, principal component analysis, boolean, complexity, deep architectures, hebbian learning, information theory.

    To download the full research paper, click on the link below. If you need an implementation of this research paper or any of its variants, feel free to contact us at contact@codersarts.com.
