ERN: Emerging Researchers National Conference in STEM

Data Science using Sensor Data Implemented on a Wearable IOT Edison Platform with Machine Learning Applications

Undergraduate #230
Discipline: Computer Sciences and Information Management
Subcategory: Computer Science & Information Systems

Alice Ngoc Lam - Kansas State University
Co-Author(s): William Hsu and Paula Mendez, Kansas State University, Manhattan, KS



Machine learning is used in a variety of industries as a tool for grouping data and revealing trends. Put simply, machine learning produces models based on past experience that enable improved performance on future instances of a task, such as classification, prediction, or pattern recognition. Here, the experience consists of historical sensor data and the task is prediction of animal disease transmission. In this project, the input consists of second-by-second RFID proximity data among susceptible bovine specimens from an experimental herd of 70 cattle, along with their daily temperature and isolation history. The goal of this work was to analyze the proximity data and prepare it as training data for supervised classification learning, enabling disease transmission and propagation models to be built. The algorithm was written in Python, a programming language chosen for its mix of functionality and development packages. The majority of the project was spent on data preparation, a crucial step that had to be completed before applying the machine learning algorithm. The data was collected from the Beef Cattle Institute (BCI), hosted on a relational (SQL) database server, and accessed both through database clients such as MySQL and through remote file access (using secure shell, or SSH). Two database queries were implemented to produce alternative training data: one consisting of a ‘Group-By’ count of tagged cattle that came within a specified radius threshold of the specimen (candidates for exposure) over an 86,400-second (one-day) interval, and another consisting of the actual list (bit vector) of those cattle. If time allows, an additional classifier, logistic regression, a machine learning model that describes the relationship between an outcome variable and one or more predictor variables, will be applied to the training data. The same algorithm derived from the BCI data will be applied to a demo on the Edison, a tiny wearable computer.
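The two training-data representations described above (a per-day count and a per-day bit vector) can be illustrated with a minimal Python sketch. It assumes proximity readings arrive as `(second, tag_a, tag_b, distance_m)` tuples. That record layout, the tag names, the function names, and the use of 0.09 m as the default threshold are illustrative assumptions, not the project's actual schema or queries.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86_400  # the one-day aggregation interval from the abstract


def daily_contacts(records, specimen, threshold_m=0.09):
    """Map day index -> set of tags that came within threshold_m of specimen."""
    contacts = defaultdict(set)
    for second, tag_a, tag_b, distance_m in records:
        # Skip readings that are too far apart or do not involve the specimen.
        if distance_m > threshold_m or specimen not in (tag_a, tag_b):
            continue
        other = tag_b if tag_a == specimen else tag_a
        if other != specimen:
            contacts[second // SECONDS_PER_DAY].add(other)
    return dict(contacts)


def count_features(contacts):
    """Group-by style feature: number of distinct nearby cattle per day."""
    return {day: len(tags) for day, tags in contacts.items()}


def bit_vector_features(contacts, herd):
    """Bit-vector feature: one 0/1 entry per herd member, in herd order."""
    return {day: [1 if tag in tags else 0 for tag in herd]
            for day, tags in contacts.items()}


# Hypothetical readings: (second, tag_a, tag_b, distance in meters).
records = [
    (10, "A", "B", 0.05),
    (20, "C", "A", 0.08),
    (30, "A", "D", 0.50),      # too far apart, ignored
    (90_000, "B", "A", 0.02),  # falls on the second day (index 1)
]
herd = ["A", "B", "C", "D"]

contacts = daily_contacts(records, "A")
counts = count_features(contacts)
vectors = bit_vector_features(contacts, herd)
```

The count is the more compact feature, while the bit vector keeps the identity of each exposure candidate, which matters if transmission depends on which animal was nearby rather than how many.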
An accurate algorithm applied to the data collected from a smart sensor on the Edison should output the anticipated predictions. Further research will apply reinforcement learning, a different machine learning method, to the algorithm to see whether its accuracy and learning efficiency increase.
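As a rough illustration of the logistic regression classifier mentioned in the abstract, the sketch below fits a one-feature model with plain batch gradient descent using only the standard library. The training values, the choice of daily contact count as the single feature, the learning rate, and the epoch count are all hypothetical; this is not the project's model or data.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, x):
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(xs[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        # Average the gradients over the batch before taking a step.
        weights = [w - lr * g / len(xs) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(xs)
    return weights, bias


# Hypothetical training set: feature = daily contact count, label 1 = later ill.
xs = [[0], [1], [2], [3], [7], [8], [9], [10]]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_logistic(xs, ys)
low_risk = predict_proba(weights, bias, [1])
high_risk = predict_proba(weights, bias, [9])
```

In practice a library implementation would likely replace the hand-written training loop; the loop is spelled out here only to show what the model computes.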

Funder Acknowledgement(s): This work was supported by the National Science Foundation grant No. 1305059 (KS-LSAMP). Additional research was made possible by the Knowledge Discoveries of Databases Lab.

Faculty Advisor: William H. Hsu, bhsu@ksu.edu

Role: I handled collection of the data hosted on a MySQL database server. Most of my time was spent writing queries that produce a ‘Group-By’ count of pairs of tagged cattle that came within a specified radius threshold, which the Beef Cattle Institute provided as 0.09 meters. To narrow the results to more accurate candidates, a thirty-second window was applied to aggregate the number of pairs in contact for at least one second. The query results listed all the cows for a selected day, along with the count of cows that came within the threshold. Later, machine learning applications were introduced.
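The windowed pair aggregation described in the role statement might look like the following Python sketch. It reads the description as: split time into fixed thirty-second windows and count a pair as in contact in a window if at least one second of that window has the pair within 0.09 m. That reading, the inclusive threshold comparison, the record layout, and all names are assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 30   # window length from the role statement
THRESHOLD_M = 0.09    # radius threshold provided by the Beef Cattle Institute


def pairs_per_window(records):
    """Map window index -> set of unordered pairs with >= 1 second in contact."""
    windows = defaultdict(set)
    for second, tag_a, tag_b, distance_m in records:
        if tag_a != tag_b and distance_m <= THRESHOLD_M:
            pair = tuple(sorted((tag_a, tag_b)))  # treat (A, B) and (B, A) alike
            windows[second // WINDOW_SECONDS].add(pair)
    return dict(windows)


def contact_window_counts(windows):
    """Count, for each pair, the number of windows in which it was in contact."""
    counts = defaultdict(int)
    for pairs in windows.values():
        for pair in pairs:
            counts[pair] += 1
    return dict(counts)


# Hypothetical per-second readings: (second, tag_a, tag_b, distance in meters).
records = [
    (0, "A", "B", 0.05),
    (12, "B", "A", 0.07),   # same pair, same window: counted once
    (31, "A", "B", 0.04),   # second window
    (45, "A", "C", 0.20),   # beyond the threshold, ignored
    (61, "C", "B", 0.09),   # exactly at the threshold, kept
]

windows = pairs_per_window(records)
window_counts = contact_window_counts(windows)
```

Collapsing repeated readings inside a window keeps a pair that lingers together for many seconds from dominating the count.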


This material is based upon work supported by the National Science Foundation (NSF) under Grant No. DUE-1930047. Any opinions, findings, interpretations, conclusions or recommendations expressed in this material are those of its authors and do not represent the views of the AAAS Board of Directors, the Council of AAAS, AAAS’ membership or the National Science Foundation.
