| name | pandas |
| description | Pandas library for data manipulation and analysis. Use for loading CSV files, data transformation, aggregation, filtering, and creating output reports. |
Pandas
Pandas provides powerful data structures and analysis tools for Python.
Loading and Saving Data
import pandas as pd
# Load CSV file
df = pd.read_csv('/app/data/drug_interactions.csv')
# Display info
print(df.head())
print(df.info())
print(df.columns.tolist())
# Save to CSV
df.to_csv('/app/output/results.csv', index=False)
Filtering and Selection
# Filter by severity
severe = df[df['severity'] == 'major']
moderate = df[df['severity'].isin(['moderate', 'major'])]
# Filter by multiple conditions
critical = df[(df['severity'] == 'major') & (df['documented'] == True)]
# Select specific columns
interactions = df[['drug_1', 'drug_2', 'severity']]
Aggregation and GroupBy
# Count interactions by severity
severity_counts = df['severity'].value_counts()
# Count interactions per drug
drug_counts = pd.concat([
df['drug_1'].value_counts(),
df['drug_2'].value_counts()
]).groupby(level=0).sum()
# Group by and aggregate
summary = df.groupby('interaction_type').agg({
'severity': 'count',
'documented': 'sum'
}).rename(columns={'severity': 'count'})
Creating DataFrames from Dictionaries
# From dictionary of lists
centrality_df = pd.DataFrame({
'drug': list(drugs),
'degree_centrality': degree_values,
'betweenness_centrality': betweenness_values
})
# From list of dictionaries
results = []
for drug in drugs:
results.append({
'drug': drug,
'degree': G.degree(drug),
'centrality': centrality[drug]
})
results_df = pd.DataFrame(results)
Sorting and Ranking
# Sort by centrality
sorted_df = centrality_df.sort_values('degree_centrality', ascending=False)
# Get top N
top_drugs = sorted_df.head(10)
# Add rank column
centrality_df['rank'] = centrality_df['degree_centrality'].rank(ascending=False)
Merging DataFrames
# Merge drug info with centrality
drug_info = pd.read_csv('/app/data/drug_info.csv')
merged = centrality_df.merge(drug_info, on='drug', how='left')
# Join severe interactions with drug details
severe_with_info = severe.merge(
drug_info,
left_on='drug_1',
right_on='drug',
how='left'
)
Saving Outputs
# Save filtered interactions
severe_df.to_csv('/app/output/severe_interactions.csv', index=False)
# Save with specific columns
output_cols = ['drug_1', 'drug_2', 'severity', 'interaction_type']
df[output_cols].to_csv('/app/output/interactions_summary.csv', index=False)