In our lab, we collected a variety of behavioral data
from experimental subjects across four distinct groups. The challenge was that this data
required extensive reformatting before it could be transferred to statistical analysis
software for group-by-group testing and graphing. This manual reformatting process was
not only time-consuming but also susceptible to human error. Moreover, the statistical
software produced only basic, aesthetically unappealing graphs, which limited the
quality of data visualization.
Solution:
To overcome the inefficiencies and error-prone nature
of manual data reformatting, a PySimpleGUI desktop application was created, tailored
for processing behavioral data from uniformly formatted Excel datasheets. The application
boasts a user-friendly graphical interface that lets lab members extract
relevant data, conduct group-by-group analyses, and generate comparative graphs
with a few clicks. Analyzed results and high-quality visualizations of group differences
are delivered within seconds, streamlining the workflow and improving the quality of
data representation.
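The group-by-group analysis described above can be illustrated with a minimal sketch that uses only the Python standard library. It is not the application's actual code: the group names and scores are made up, and the helper names `sem` and `one_way_anova_f` are hypothetical. It computes each group's mean and SEM, plus the F statistic of a one-way ANOVA across the four groups.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up scores for four groups; real values would come from the Excel sheets.
groups = {
    "A": [10.0, 12.0, 14.0],
    "B": [11.0, 13.0, 15.0],
    "C": [20.0, 22.0, 24.0],
    "D": [21.0, 23.0, 25.0],
}


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))


def one_way_anova_f(samples):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for s in samples for v in s]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of each value around its own group mean.
    ss_within = sum((v - mean(s)) ** 2 for s in samples for v in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


summary = {name: (mean(vals), sem(vals)) for name, vals in groups.items()}
f_stat, df_between, df_within = one_way_anova_f(list(groups.values()))

for name, (group_mean, group_sem) in summary.items():
    print(f"{name}: mean={group_mean:.2f}, SEM={group_sem:.2f}")
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

The per-group mean and SEM are the values a bar graph with error bars would plot, while the F statistic and degrees of freedom are the core of an ANOVA table. A p-value would additionally require the F distribution, which a statistics library could supply.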
PySimpleGUI
Graphs generated with SEM & ANOVA tables
Short video demonstrating the desktop application's functionality
Least Different Group Matching Analysis & Visualization
Problem:
In experiments where pair-housed subjects
undergo an initial baseline test and are then assigned to groups based on that baseline data,
the challenge lies in assigning them to two groups (control vs treatment) with
minimal initial differences between groups. Doing so ensures that any biases from
previous conditions do not influence the outcomes of subsequent experiments.
Moreover, it's crucial to keep cage mates within the same group to prevent
potential cross-group effects: if one subject experiences stress due to its
group's treatment, it could affect the results of its paired partner in future
tests. Compounding the challenge, identifying the two groups that exhibit the
absolute least difference can involve
analyzing and comparing hundreds, if not thousands, of possible group
assignments. Such extensive analysis would be impractical for a human to
perform manually.
Solution:
A PySimpleGUI desktop application was developed to analyze individual
data from cage mate pairs stored in an Excel file. The application systematically evaluates every possible
combination of group assignments to identify the combination where the
data points exhibit the least overall difference while preserving
their pairing. This analysis
includes calculating and recording various metrics in an Excel
output file, such as:
All possible group combinations considered
The means of all possible group combinations
The absolute differences between data points across all group combinations
The specific group assignments resulting in the smallest mean difference between groups
The program also displays the group assignments graphically using Matplotlib.
The line plot shows the data points of the least
different group assignment, with cage mate pairing
preserved. The lines visually
illustrate the minimal possible distance between data points.
The application’s output provides a comprehensive overview of the
analysis steps and results, facilitating informed decision-making
for optimal group assignment in experimental setups.
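The exhaustive search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's actual code: the cage names and baseline scores are made up, the function name `least_different_split` is hypothetical, and the two groups are assumed to receive equal numbers of cages.

```python
from itertools import combinations
from statistics import mean

# Made-up baseline scores; each tuple holds the two cage mates of one cage.
cages = {
    "cage1": (10.0, 12.0),
    "cage2": (20.0, 22.0),
    "cage3": (11.0, 13.0),
    "cage4": (21.0, 19.0),
}


def least_different_split(cages):
    """Try every way to split whole cages into two groups.

    Returns (best, all_splits), where each entry is
    (absolute mean difference, control cages, treatment cages).
    """
    names = sorted(cages)
    half = len(names) // 2
    all_splits = []
    for control in combinations(names, half):
        # With an even number of cages, every split appears twice (groups
        # swapped). Keeping only splits with the first cage in control
        # removes those mirror images.
        if len(names) % 2 == 0 and names[0] not in control:
            continue
        treatment = tuple(n for n in names if n not in control)
        control_mean = mean(v for n in control for v in cages[n])
        treatment_mean = mean(v for n in treatment for v in cages[n])
        all_splits.append((abs(control_mean - treatment_mean), control, treatment))
    return min(all_splits), all_splits


best, all_splits = least_different_split(cages)
print(f"Splits evaluated: {len(all_splits)}")
print(f"Best split: control={best[1]}, treatment={best[2]}, difference={best[0]:.2f}")
```

Because whole cages are assigned rather than individual subjects, cage mates always stay in the same group. The number of candidate splits grows combinatorially with the number of cages, which is why the search is delegated to a program.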
PySimpleGUI
Graph generated showing the hypothetical
two-group assignment with the least difference between groups
out of all possible group assignments
Short video demonstrating the desktop application's functionality
Robinhood Stock Data Analysis & Visualization
Problem:
The challenge was to develop a systematic
approach for evaluating the performance of a portfolio of individual
stocks over time.
Solution:
To address the challenge, a Python script was developed using the
Robin_Stocks library and the Robinhood API to record daily stock
information into a CSV file. The data is then analyzed and
visualized with the Pandas and Matplotlib libraries. The script
aggregates the performance data of individual stock selections
and compares it against the aggregate performance of two
diversified ETFs. This comparison is visualized through graphs
that illustrate total return and other key performance metrics,
enabling easy evaluation and benchmarking of the portfolio’s
performance over time.
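The benchmarking step can be illustrated with a short sketch that uses only the standard library. It deliberately omits the Robin_Stocks calls and the Pandas and Matplotlib steps. The CSV layout, column names, and dollar values below are hypothetical, and `percent_change_series` is an illustrative helper rather than the script's actual function.

```python
import csv
import io

# Hypothetical daily log; the column names and values are illustrative only.
DAILY_LOG = """date,portfolio_value,etf_benchmark_value
2024-01-02,1000.00,1000.00
2024-01-03,1010.00,1005.00
2024-01-04,1030.00,1020.00
"""


def percent_change_series(rows, column):
    """Percent change of each day's value relative to the first day."""
    values = [float(row[column]) for row in rows]
    start = values[0]
    return [(value - start) / start * 100 for value in values]


rows = list(csv.DictReader(io.StringIO(DAILY_LOG)))
portfolio_pct = percent_change_series(rows, "portfolio_value")
etf_pct = percent_change_series(rows, "etf_benchmark_value")

# Positive numbers mean the stock picks are ahead of the ETF benchmark.
excess_pct = [p - e for p, e in zip(portfolio_pct, etf_pct)]

for row, p, e, x in zip(rows, portfolio_pct, etf_pct, excess_pct):
    print(f"{row['date']}: portfolio {p:+.2f}%, ETFs {e:+.2f}%, excess {x:+.2f}%")
```

Expressing both series as percent change from the same start date puts the stock selections and the ETF benchmark on a common scale, which is what makes a line graph of the two directly comparable.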
Script created using the Robin_Stocks
Python library and the Robinhood API
Top line graph displaying dollar value change over time.
Line graph displaying percent change over time.
Bar graph displaying the current total change.
Pie chart displaying the portfolio's distribution across asset classes.
Pie chart displaying each asset class's current return on investment.
Wheel Running Data Analysis & Visualization
Problem:
In studies investigating the effects of exercise using a running wheel,
our lab used a program that tracked total revolutions every minute.
This data was logged into an Excel worksheet each minute over a 24-hour
cycle, creating one sheet per subject for each 24-hour cycle. For many
subjects over extended periods, this generated millions of data points.
The analytical challenge included counting only valid running bouts
(≥3 revolutions per minute) and distinguishing running behaviors between
the light (inactive) and dark (active) phases of the subject’s light cycle.
The goal was to efficiently process and analyze this extensive dataset to
quantify various aspects of running behavior while considering phase-specific
activity patterns, providing meaningful insights into the effects of exercise.
Solution:
To address the data management challenges posed by these running wheel experiments,
a user-friendly graphical interface program was developed. This program compiles
and organizes the raw data from Excel files into a ‘Combined Data’ sheet alongside
the original data sheets, categorizing running behaviors and distinguishing them by
the subjects’ inactive (light) and active (dark) phases. Automating data calculation
and aggregation significantly reduced the manual workload and minimized errors,
accomplishing in seconds what would otherwise have taken hours. Additionally, the program
features a bar graph visualization tool, providing users with a quick and
informative visual analysis of the dataset.
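The core counting rule can be sketched in a few lines. This is a simplified illustration, not the program's actual code. It assumes one subject-day is a list of 1,440 per-minute revolution counts and that recording starts at lights-on, so the first 720 minutes are the light phase. Both are assumptions, and `summarize_day` is a hypothetical name.

```python
# Assumptions (not from the original program): one subject-day is a list of
# 1440 per-minute revolution counts, and recording starts at lights-on, so
# minutes 0-719 fall in the light phase and minutes 720-1439 in the dark phase.
MIN_VALID_REVS = 3          # a minute counts as running at >= 3 revolutions
LIGHT_PHASE_MINUTES = 720   # 12-hour light phase


def summarize_day(revs_per_minute):
    """Count valid running minutes and their revolutions for each phase."""
    summary = {
        "light": {"running_minutes": 0, "revolutions": 0},
        "dark": {"running_minutes": 0, "revolutions": 0},
    }
    for minute, revs in enumerate(revs_per_minute):
        if revs < MIN_VALID_REVS:
            continue  # below threshold: not counted as running
        phase = "light" if minute < LIGHT_PHASE_MINUTES else "dark"
        summary[phase]["running_minutes"] += 1
        summary[phase]["revolutions"] += revs
    return summary


# Made-up day: one sub-threshold minute and one valid minute in the light
# phase, two valid minutes in the dark phase.
day = [0] * 1440
day[10] = 2
day[20] = 5
day[800] = 3
day[801] = 10

result = summarize_day(day)
print(result)
```

Running the same function over every sheet and collecting the results is one way the per-subject summaries could be assembled into a combined table.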
PySimpleGUI
Graph generated showing mean values
of different wheel running behaviors during the animals' active (dark) 12 hr
phase versus their inactive (light) 12 hr phase.
Short video demonstrating the desktop application's functionality
Financial Data Analysis & Visualization
Problem:
Credit card companies and
banks typically offer basic charts that group spending into
broad categories. While useful for a general overview, these
charts fail to track spending trends over time and frequently
miscategorize transactions, leading to inaccuracies. These
inaccuracies can mislead users, affecting their ability to make
informed financial decisions and plan budgets effectively.
A more personalized and precise system that accurately
categorizes spending and tracks detailed spending patterns over
time would provide more reliable data analysis to support financial planning.
Solution:
To address the limitations of basic spending charts, a
Python script was developed to offer a personalized and
accurate view of spending data over time. This script
processes CSV files from bank and credit card statements
using the Pandas library, categorizing and analyzing the
data in detail. It then generates visualizations with
Matplotlib, such as stacked bar graphs, to present monthly
spending across specific categories. This approach provides
clear, detailed insights into financial activity, facilitating
better tracking and analysis of expenditures.
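The categorize-then-aggregate idea can be shown with a minimal standard-library sketch. The actual script uses Pandas and Matplotlib; this example does not. The statement layout, merchant names, keyword rules, and function names below are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical statement export; real bank CSV layouts differ.
STATEMENT = """Date,Description,Amount
2024-01-05,GROCERY MART #12,54.20
2024-01-17,CITY POWER UTILITY,80.00
2024-02-02,GROCERY MART #12,45.80
2024-02-09,CORNER CAFE,6.50
"""

# Illustrative, user-defined keyword rules; the first matching category wins.
CATEGORY_KEYWORDS = {
    "Groceries": ["grocery"],
    "Utilities": ["utility"],
}


def categorize(description):
    """Map a transaction description to a category via keyword matching."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"


def monthly_totals(csv_text):
    """Sum amounts per month (YYYY-MM) and category."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in csv.DictReader(io.StringIO(csv_text)):
        month = row["Date"][:7]
        totals[month][categorize(row["Description"])] += float(row["Amount"])
    return {month: dict(by_category) for month, by_category in totals.items()}


totals = monthly_totals(STATEMENT)
for month in sorted(totals):
    print(month, totals[month])
```

Each month's dictionary corresponds to one stacked bar, with one segment per category. Keeping the keyword rules in a single table makes it easy to correct a miscategorized merchant.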
Checking account activity divided into categories by month
Credit Card expenditures divided into categories by month
Activity Text File Data Extraction to Excel & Visualization
Problem:
In our lab, we quantified
and analyzed the locomotion activity of experimental test
subjects using an activity tracking chamber. This chamber
records the subjects’ movements via infrared beam breaks
and the software outputs comprehensive data in a .txt file
across various categories. However, our analysis required
only specific portions of this data. The large amount of superfluous data in text format made locating,
extracting, and transferring the relevant data into a
formatted Excel spreadsheet for statistical analysis a
manual, labor-intensive task that consumed significant time.
The process was not only inefficient but also prone to errors, especially with large datasets from
multiple subjects over extended experimental trials.
Solution:
To address the inefficiencies in handling locomotion activity
data, a desktop application was developed with a user-friendly
interface that automates the data processing workflow.
The application transforms the raw .txt file into a
standardized format, selectively extracts the relevant data
categories required for analysis, and transfers this data
into a properly formatted Excel spreadsheet. This automation
reduces the processing time from hours to seconds and eliminates
the risk of human errors associated with manual data handling.
In addition to data handling, the program allows the user to output
a bar graph of activity data over time for rapid analysis and anomaly detection.
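The selective extraction step might look something like the sketch below. The chamber software's real output layout is not shown here, so the raw text format, field names, and regular expression are entirely hypothetical; only the overall pattern (pull one field per time bin, then compute the mean and SEM across subjects) reflects the workflow described above.

```python
import re
from math import sqrt
from statistics import mean, stdev

# Hypothetical raw output; the real .txt layout is not known from the text.
RAW_TEXT = """\
Subject: S1
Bin 1 Ambulatory Counts: 120 Rearing: 14
Bin 2 Ambulatory Counts: 80 Rearing: 9
Subject: S2
Bin 1 Ambulatory Counts: 100 Rearing: 11
Bin 2 Ambulatory Counts: 60 Rearing: 7
"""

# Capture the bin number and the one field of interest; ignore everything else.
BIN_RE = re.compile(r"^Bin\s+(\d+)\s+Ambulatory Counts:\s*(\d+)")


def extract_counts(text):
    """Return {bin number: [count for each subject]} from the raw text."""
    counts = {}
    for line in text.splitlines():
        match = BIN_RE.match(line)
        if match:
            counts.setdefault(int(match.group(1)), []).append(int(match.group(2)))
    return counts


def bin_stats(counts):
    """Mean and SEM across subjects for each time bin."""
    return {
        bin_number: (mean(values), stdev(values) / sqrt(len(values)))
        for bin_number, values in sorted(counts.items())
    }


counts = extract_counts(RAW_TEXT)
stats = bin_stats(counts)
for bin_number, (bin_mean, bin_sem) in stats.items():
    print(f"Bin {bin_number}: mean={bin_mean:.1f}, SEM={bin_sem:.1f}")
```

The per-bin mean and SEM are the values a bar graph with error bars would display; writing them to a spreadsheet would be a separate step.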
PySimpleGUI
Graph generated showing the mean locomotion of all subjects
for each 5 min time bin over the 1 hr trial period, along with the SEM of each
time bin. The graph is characteristic of the initial high exploratory behavior in a novel
environment that tapers off over time.
Short video demonstrating the desktop application's functionality