Internship Projects/Mentors


Title

Use of NLP for Telco Data (logs) Anonymization

Status

ACCEPTING APPLICATIONS

Difficulty

MEDIUM


Description 

This work explores the feasibility and effectiveness of NLP techniques for anonymizing Telco data (logs). The goal is to answer the following questions:

  1. Are sufficient, usable datasets available in the public domain to carry out this work?
  2. What techniques are currently used for anonymizing log data?
  3. Do NLP-based techniques offer any efficiency gains over existing techniques?
  4. Are the available libraries and tools (e.g., Presidio) sufficient?
  5. What types of log data are suitable for NLP-based techniques?
  6. Does anonymizing log data affect the performance (e.g., predictive power, detection accuracy) of ML techniques applied to it?


Apart from answering the above questions, this work is also expected to produce a tool that takes log data and anonymizes it using an NLP-based approach.
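To make the input/output shape of such a tool concrete, here is a minimal sketch of a rule-based baseline, the kind of existing technique that question 2 asks about and that an NLP-based approach could be compared against. It is not the NLP-based method itself. The names `PATTERNS`, `pseudonym`, `anonymize_line`, `anonymize_log` and the `salt` parameter are hypothetical and chosen only for illustration, and the two regular expressions cover only simple IPv4 addresses and email addresses. In an NLP-based version, the pattern-matching step would presumably be replaced by an entity recognizer.

```python
import hashlib
import re

# Hypothetical patterns for two identifier types that often appear in logs.
# Emails are handled first so an address is replaced as a whole.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def pseudonym(label, value, salt="demo-salt"):
    """Map a sensitive value to a stable placeholder such as <IP_1a2b3c4d>."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:8]
    return f"<{label}_{digest}>"


def anonymize_line(line, salt="demo-salt"):
    """Replace every match of every pattern in one log line."""
    for label, pattern in PATTERNS.items():
        line = pattern.sub(
            lambda m, label=label: pseudonym(label, m.group(0), salt), line
        )
    return line


def anonymize_log(lines, salt="demo-salt"):
    """Anonymize a sequence of log lines and return them as a list."""
    return [anonymize_line(line, salt) for line in lines]


if __name__ == "__main__":
    sample = [
        "2023-01-01 login user=alice@example.com from 10.0.0.1",
        "2023-01-01 logout from 10.0.0.1",
    ]
    for original, masked in zip(sample, anonymize_log(sample)):
        print(original, "->", masked)
```

Because the placeholder is derived from a salted hash, the same value maps to the same placeholder on every line, so events can still be correlated after anonymization. That property matters for question 6, since downstream ML techniques often rely on such correlations.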

Additional Information

Because this request is for a part-time internship, Thoth is teaming up with the general Anuket project so that the intern can work on both projects.

Learning Objectives

Working on this project will help the student to:

1. Understand Telco data and the need for anonymization.

2. Understand different techniques and methodologies of anonymization.

3. Master the use of NLP for anonymization.

Expected Outcome

A tool that takes original data and outputs anonymized data.

Relation to LF Networking 

Anuket, Thoth

Education Level

Undergrad (BE)

Skills

  1. Python
  2. Basics of Data Analytics and ML.
  3. Basics of NLP.

Future plans

This tool will get merged with other anonymization techniques.

Preferred Hours and Length of Internship

3 Months Part-Time or 1.5 Months Full-Time (½ of the LF Mentorship Program duration).

Mentor(s) Names and Contact Info

Click here to apply


Sridhar Rao, srao@linuxfoundation.org, sridharkn, The Linux Foundation.


