
Modern Information Research and Processing Technologies. MFJ-101. Practical Task 3. Data Mining. Outer Source Clustering

Preparation for Data Mining:

1.1. Prepare data for mining: collect one media text and save it as a .tab file using Notepad++ (File - Save As - All files - .tab) in the designated folder.

1.2. Open the Guardian widget, set the query word and the publication date range (a range that returns up to 10-20 files is recommended so the search runs fast), and run the search. Choose Headline and Content as the fields to include.
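The two preparation steps can also be sketched in Python. This is only an illustration under stated assumptions, not what Notepad++ or the Guardian widget does internally: the function names `corpus_to_tab` and `guardian_query`, the column names `headline` and `content`, and the placeholder API key are hypothetical. The three header rows (names, types, flags) follow the Orange .tab layout as I understand it, and the query uses the public Guardian content API parameters.

```python
from urllib.parse import urlencode


def corpus_to_tab(documents):
    """Render (headline, content) pairs as tab-separated text.

    Assumed layout: three header rows (column names, column types,
    column flags), then one row per document.
    """
    rows = [
        ("headline", "content"),  # column names (hypothetical)
        ("string", "string"),     # both columns hold free text
        ("meta", "meta"),         # text columns are stored as meta attributes
    ]
    for headline, content in documents:
        # Collapse tabs and newlines so each document stays on one row.
        rows.append((" ".join(headline.split()), " ".join(content.split())))
    return "\n".join("\t".join(row) for row in rows) + "\n"


def guardian_query(word, date_from, date_to, page_size=20, api_key="YOUR-API-KEY"):
    """Build a search URL for the Guardian content API (no request is sent)."""
    params = {
        "q": word,
        "from-date": date_from,   # YYYY-MM-DD
        "to-date": date_to,       # YYYY-MM-DD
        "page-size": page_size,   # keep small (10-20) for a fast run
        "show-fields": "headline,bodyText",
        "api-key": api_key,       # placeholder, replace with a real key
    }
    return "https://content.guardianapis.com/search?" + urlencode(params)
```

To save the text, write the string returned by `corpus_to_tab` to a file with the .tab extension using UTF-8 encoding.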

Simple data mining clustering:



Legend: outer source clustering scenario. The upper level of the workflow is the deploy branch; the lower level is the check-up branch.

Right-click to open the context menu, type a widget name to search for it, and left-click to select it.

Deploy:

3. Open Preprocess Text and tick Lowercase, Regexp, and Stopwords.

4. Link it to Word Cloud to see word frequencies.

5. Link Preprocess Text to Bag of Words, set to Count, None, and Euclidean (term frequency, document frequency, and regularization, respectively).

6. Connect Bag of Words to Distances and compare columns. Regularization must be set to Euclidean.

7. Link Distances to Statistics, with Word count and Character count selected.

8. Link Line Chart at the end of the fork.

9. Open Line Chart and set up 2-3 plots to look for coincidences.

10. Save the final result as an image file and accompany it with a description.
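For anyone who wants to check what the deploy steps compute, here is a minimal plain-Python sketch of steps 3-7. It is not the widgets' own implementation: the function names and the tiny stopword list are hypothetical, the regular expression `\w+` is an assumed tokenizer, and the character count is assumed to include spaces. Plotting (steps 8-9) is left out.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (hypothetical, not the widget's list).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for"}


def preprocess(text):
    """Step 3: lowercase, tokenize with a regexp, drop stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def word_frequencies(token_lists):
    """Step 4: word frequencies over all documents, most common first."""
    return Counter(t for tokens in token_lists for t in tokens).most_common()


def bag_of_words(token_lists):
    """Step 5: raw term counts, one row per document."""
    vocabulary = sorted({t for tokens in token_lists for t in tokens})
    rows = []
    for tokens in token_lists:
        counts = Counter(tokens)
        rows.append([counts[word] for word in vocabulary])
    return vocabulary, rows


def l2_normalize(rows):
    """Step 5, regularization: scale each row to Euclidean length 1."""
    normalized = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        normalized.append([v / norm if norm else 0.0 for v in row])
    return normalized


def column_distances(rows):
    """Step 6: Euclidean distance between every pair of columns (words)."""
    columns = list(zip(*rows))
    return [[math.dist(a, b) for b in columns] for a in columns]


def text_statistics(text):
    """Step 7: word count and character count (spaces included)."""
    return {"word_count": len(text.split()), "character_count": len(text)}


docs = ["The cat sat on the mat", "The dog sat"]
token_lists = [preprocess(d) for d in docs]
vocabulary, counts = bag_of_words(token_lists)
distances = column_distances(l2_normalize(counts))
for word, row in zip(vocabulary, distances):
    print(word, [round(d, 3) for d in row])
```

Words that occur in the same documents end up at distance 0 from each other, which is the kind of coincidence the Line Chart in step 9 is meant to reveal.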

Task results will be accepted until 13:00 on October 30, directly in the comments here or via Telegram.
