FreezeCasting.net: A Central Repository of Freeze-Casting Data

Demonstrative SQL-Python/Jupyter notebook for working with the FreezeCasting.net database


To utilize this notebook, you will need:

Python and Jupyter notebook (the 32-bit version of Python is sufficient for working with the database in its current form).

The following Python packages: matplotlib, pandas, NumPy, SciPy, and MySQLdb.

The FreezeCasting.net database in SQL format, which is available here.

An alternative form of this notebook, which imports data via CSV, is available here.

Bundled packages containing most of the dependencies you will need can be obtained for free. The demonstrative notebook provided here was built using the Anaconda package, which includes the packages listed above. The Matplotlib extension Basemap is needed to run the plot in Figure 2. The extension is not included in the Anaconda package; however, it may be installed via pip, and the corresponding lines of code are provided in the Figure 2 code block.

MySQL can be downloaded here.

Note: You may view the static HTML format of the notebook here without installing Python.

Questions/concerns? Email: info@freezecasting.net

Notebook set-up

In [1]:
# Enables inline-plot rendering
%matplotlib inline
%config InlineBackend.figure_formats = {'png', 'retina'}
# Utilized to create and work with dataframes
import pandas as pd
import numpy as np
import math as m
# MATPLOTLIB
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
import matplotlib as mpl
import matplotlib.cm as cm
import matplotlib.colors as mcolors
from matplotlib import rc
from matplotlib.ticker import MultipleLocator, FormatStrFormatter
from pylab import rcParams
from matplotlib.ticker import AutoMinorLocator
import scipy.stats as stats
import statsmodels.api as sm
import matplotlib.ticker as ticker
# For point density plots:
from scipy.stats import gaussian_kde
# Set ipython's max row display
pd.set_option('display.max_row', 1000)
pd.set_option('display.max_columns', 100)
pd.set_option('display.max_colwidth', 2000)
In [2]:
# Specify matplotlib default plotting aesthetics 
def default_plotting():
    #fig
    plt.rcParams['savefig.dpi'] = '100'
    plt.rcParams['savefig.format'] = ('png')
    #font
    plt.rcParams['font.family'] = 'Georgia'
    #plt.rcParams['font.size'] = 22
    plt.rcParams['axes.labelsize'] = 18
    plt.rcParams['axes.titlesize'] = 18
    plt.rcParams['xtick.labelsize'] = 18
    plt.rcParams['ytick.labelsize'] = 18
    #legend
    #plt.rcParams['legend.fontsize'] = 12
    #axes
    plt.rcParams['axes.linewidth'] = 1
    plt.rcParams['axes.edgecolor'] = 'black'
    plt.rcParams['axes.facecolor'] = 'white'
    plt.rcParams['axes.grid'] = False
    plt.rcParams['axes.labelpad'] = 10
    plt.rcParams['axes.axisbelow'] = False
    #tick marks
    plt.gca().xaxis.set_ticks_position('bottom')
    plt.gca().yaxis.set_ticks_position('left')
    plt.rcParams['xtick.major.size'] = 7
    plt.rcParams['ytick.major.size'] = 7
    plt.rcParams['ytick.minor.size'] = 3
    plt.rcParams['xtick.minor.size'] = 3
    plt.rcParams['ytick.major.width'] = 1
    plt.rcParams['ytick.minor.width'] = 1
    plt.rcParams['xtick.major.width'] = 1
    plt.rcParams['xtick.minor.width'] = 1
    plt.rcParams['xtick.direction'] = 'in'
    plt.rcParams['ytick.direction'] = 'in'
default_plotting()
plt.close()

Connect to MySQL database

In [3]:
import MySQLdb
#command-line arguments
import sys
In [4]:
conn = MySQLdb.connect(host="hostname",
    user="userid",
    passwd="password",
    db="freezecasting_opendata")
# Create proxy cursor to interact with the database
cursor = conn.cursor()
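
Once all of the queries below have been run, the cursor and connection can be closed to release database resources. The two calls below are standard MySQLdb clean-up; they are left commented out here because the remainder of the notebook still needs the open connection.

# Uncomment and run after all data have been imported
#cursor.close()
#conn.close()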

Data import

Each MySQL table is imported via an independent SQL query. To work with the database tables within Python, the queried data is placed into dataframes using the pandas toolkit.

For purposes of demonstrating table linkage, we will import most of the database tables, merge them into a single pandas dataframe, plot various relationships, and export select data columns to a CSV file. However, it is not necessary to import all of the database tables, and in most cases it is much more practical to import only the tables containing variables of interest. The only mandatory imports for joining secondary tables are the "papers" and "samples" tables.

Import and merge the following 'primary' tables: (i) authors, (ii) papers, (iii) samples.

The "authors" table provides contact information for the corresponding author, the "papers" table provides the data necessary for citing the data, and the "samples" table provides the "sample_ID" necessary for linking secondary tables.

In [5]:
# IMPORT AUTHORS TABLE
cursor.execute('SELECT * FROM authors')
rows = cursor.fetchall()
authors = pd.DataFrame( [[ij for ij in i] for i in rows])
authors.rename(columns={0: 'author_ID', 1: 'name_author', 2:'affiliation',
                        3: 'city', 4: 'country', 5: 'long', 6: 'lat', 
                        7: 'email', 8: 'last_updated'},
               inplace = True);
# drop last_update column
authors = authors.drop('last_updated', axis = 1)
# Author_ID is imported as float; change to integer data type
authors['author_ID'] = authors['author_ID'].astype('int')

# IMPORT PAPERS TABLE
cursor.execute('SELECT * FROM papers')
rows = cursor.fetchall()
papers = pd.DataFrame( [[ij for ij in i] for i in rows])
# Column names (i.e., variable names) can be changed below; just be sure to maintain order
# (see the pandas documentation for reordering columns).
papers.rename(columns={0: 'paper_ID', 1: 'author_ID', 2:'peer_review',
                       3: 'usage', 4: 'authors', 5: 'title',
                       6: 'journal', 7: 'volume', 8: 'issue', 
                       9: 'pages', 10: 'year', 11: 'doi', 
                       12: 'material', 13: 'material_group',
                       14: 'last_updated'},
              inplace = True);
# Drop last_updated column 
# --A last_updated column is contained within all tables; it is an automatic timestamp that
# shows the last time the corresponding row was updated. These columns are unnecessary here, 
# so we will drop the column from each table.
papers = papers.drop('last_updated', axis = 1)
papers['year'] = papers['year'].astype('int')
papers['paper_ID'] = papers['paper_ID'].astype('int')

# IMPORT SAMPLES TABLE
cursor.execute('SELECT * FROM samples')
rows = cursor.fetchall()
samples = pd.DataFrame( [[ij for ij in i] for i in rows])
samples.rename(columns={0: 'sample_ID', 1: 'paper_ID', 2: 'number', 3: 'material_ID', 
                       4: 'vf_total', 5: 'particles_total', 6: 'fluids_total',
                       7: 'composite', 8: 'last_updated'},
              inplace = True);
# drop last_update column
samples = samples.drop('last_updated', axis = 1)
samples['sample_ID'] = samples['sample_ID'].astype('int')

# Merge papers and authors tables to link corresponding author contact data
# --author_ID acts as primary key for authors table and foreign key for papers table
papers = pd.merge(papers, authors, on = 'author_ID', how = 'outer')

# Merge papers and samples table to link sample data to citation data
# --paper_ID acts as primary key for papers table and foreign key for samples table
samples = pd.merge(samples, papers, on = 'paper_ID', how = 'outer')

# Uncomment and run the following to verify column names and/or print the dataframe header
#samples.columns.values
#samples.head()

data = samples[(samples['sample_ID']) > 0]

Import secondary tables

Import suspension tables and merge with the dataframe

In [6]:
# IMPORT SUSPENSION TABLE
cursor.execute('SELECT * FROM suspension')
rows = cursor.fetchall()
suspension = pd.DataFrame( [[ij for ij in i] for i in rows])
suspension.rename(columns={0: 'suspension_ID', 1: 'sample_ID', 2: 'time_mill', 
                       3: 'pH_susp', 4: 'viscosity_susp', 5: 'zeta_susp',
                       6: 'last_updated'},
              inplace = True);
# drop last_update column
suspension = suspension.drop('last_updated', axis = 1)
# suspension_ID is a primary key for the suspension table; it is not needed here
suspension = suspension.drop('suspension_ID', axis = 1)
In [7]:
#-----FLUID TABLES-----#
# Fluid properties table
cursor.execute('SELECT * FROM props_fluids')
rows = cursor.fetchall()
props_fluids = pd.DataFrame( [[ij for ij in i] for i in rows])
props_fluids.rename(columns={0: 'props_fluid_ID', 1: 'name_fluid', 2: 'density_liq_fluid', 
                             3: 'density_sol_fluid', 4: 'thermal_cond_liq_fluid',
                             5: 'thermal_cond_sol_fluid', 6: 'last_updated'},
              inplace = True);
props_fluids = props_fluids.drop('last_updated', axis = 1)
# Fluid 1 
cursor.execute('SELECT * FROM susp_fluid_1')
rows = cursor.fetchall()
susp_fluid_1 = pd.DataFrame( [[ij for ij in i] for i in rows])
susp_fluid_1.rename(columns={0: 'fluid_1_ID', 1: 'sample_ID', 2: 'props_fluid_ID', 
                       3: 'vf_fluid_1', 4: 'last_updated'},
              inplace = True);
susp_fluid_1 = susp_fluid_1.drop('last_updated', axis = 1)
susp_fluid_1 = susp_fluid_1.drop('fluid_1_ID', axis = 1)
# Merge fluid properties table with fluid 1 table; 'props_fluid_ID' is primary key
# for the fluid properties table and foreign key for the susp_fluid_1 table.
susp_fluid_1 = pd.merge(props_fluids, susp_fluid_1, on = 'props_fluid_ID', how = 'outer')
# rename the 'props_fluid_id' and other property columns such that they can be differentiated for fluid 1 and fluid 2
susp_fluid_1.rename(columns={'props_fluid_ID':'props_fluid_ID1'}, inplace=True)
susp_fluid_1.rename(columns={'name_fluid':'name_fluid1'}, inplace=True)
susp_fluid_1.rename(columns={'density_liq_fluid':'density_liq_fluid1'}, inplace=True)
susp_fluid_1.rename(columns={'density_sol_fluid':'density_sol_fluid1'}, inplace=True)
susp_fluid_1.rename(columns={'thermal_cond_liq_fluid':'thermal_cond_liq_fluid1'}, inplace=True)
susp_fluid_1.rename(columns={'thermal_cond_sol_fluid':'thermal_cond_sol_fluid1'}, inplace=True)

# Fluid 2
cursor.execute('SELECT * FROM susp_fluid_2')
rows = cursor.fetchall()
susp_fluid_2 = pd.DataFrame( [[ij for ij in i] for i in rows])
susp_fluid_2.rename(columns={0: 'fluid_2_ID', 1: 'sample_ID', 2: 'props_fluid_ID', 
                       3: 'vf_fluid_2', 4: 'last_updated'},
              inplace = True);
susp_fluid_2 = susp_fluid_2.drop('last_updated', axis = 1)
susp_fluid_2 = susp_fluid_2.drop('fluid_2_ID', axis = 1)
susp_fluid_2 = pd.merge(props_fluids, susp_fluid_2, on = 'props_fluid_ID', how = 'outer')
susp_fluid_2.rename(columns={'props_fluid_ID':'props_fluid_ID2'}, inplace=True)
susp_fluid_2.rename(columns={'name_fluid':'name_fluid2'}, inplace=True)
susp_fluid_2.rename(columns={'density_liq_fluid':'density_liq_fluid2'}, inplace=True)
susp_fluid_2.rename(columns={'density_sol_fluid':'density_sol_fluid2'}, inplace=True)
susp_fluid_2.rename(columns={'thermal_cond_liq_fluid':'thermal_cond_liq_fluid2'}, inplace=True)
susp_fluid_2.rename(columns={'thermal_cond_sol_fluid':'thermal_cond_sol_fluid2'}, inplace=True)

# Merge fluids tables
fluids = pd.merge(susp_fluid_1, susp_fluid_2, on = 'sample_ID', how = 'outer')
fluids = fluids[(fluids['sample_ID']) > 0]

# Merge suspension table
suspension = pd.merge(fluids, suspension, on = 'sample_ID', how = 'outer')
In [8]:
#-----PARTICLE TABLES-----#
# Particle properties table
cursor.execute('SELECT * FROM props_particles')
rows = cursor.fetchall()
props_particles = pd.DataFrame( [[ij for ij in i] for i in rows])
props_particles.rename(columns={0: 'props_part_ID', 1: 'name_part', 2: 'description_part', 
                             3: 'density_part', 4: 'thermal_cond_part',
                             5: 'last_updated'},
              inplace = True);
props_particles = props_particles.drop('last_updated', axis = 1)
# Particle 1 
cursor.execute('SELECT * FROM susp_part_1')
rows = cursor.fetchall()
susp_part_1 = pd.DataFrame( [[ij for ij in i] for i in rows])
susp_part_1.rename(columns={0: 'particle_1_ID', 1: 'sample_ID', 2: 'props_part_ID', 
                       3: 'shape_part_1', 4: 'dia_part_1', 5: 'diaI_part_1', 
                           6: 'length_part_1', 7: 'thick_part_1', 8: 'vf_part_1', 
                           9: 'last_updated'},
              inplace = True);
susp_part_1 = susp_part_1.drop('last_updated', axis = 1)
susp_part_1 = susp_part_1.drop('particle_1_ID', axis = 1)
# Merge particle properties table with particle 1 table; 'props_part_ID' is primary key
# for the particle properties table and foreign key for the susp_part_1 table.
susp_part_1 = pd.merge(props_particles, susp_part_1, on = 'props_part_ID', how = 'outer')
# rename the 'props_part_ID' column and other property columns such that they can be differentiated for particle 1 and particle 2
susp_part_1.rename(columns={'props_part_ID':'props_part_ID1'}, inplace=True)
susp_part_1.rename(columns={'name_part':'name_part1'}, inplace=True)
susp_part_1.rename(columns={'description_part':'description_part1'}, inplace=True)
susp_part_1.rename(columns={'density_part':'density_part1'}, inplace=True)
susp_part_1.rename(columns={'thermal_cond_part':'thermal_cond_part1'}, inplace=True)
susp_part_1 = susp_part_1[(susp_part_1['sample_ID']) > 0]

# Particle 2
cursor.execute('SELECT * FROM susp_part_2')
rows = cursor.fetchall()
susp_part_2 = pd.DataFrame( [[ij for ij in i] for i in rows])
susp_part_2.rename(columns={0: 'particle_2_ID', 1: 'sample_ID', 2: 'props_part_ID', 
                       3: 'shape_part_2', 4: 'dia_part_2', 5: 'diaI_part_2', 
                           6: 'length_part_2', 7: 'thick_part_2', 8: 'vf_part_2', 
                           9: 'last_updated'},
              inplace = True);
susp_part_2 = susp_part_2.drop('last_updated', axis = 1)
susp_part_2 = susp_part_2.drop('particle_2_ID', axis = 1)
# Merge particle properties table with particle 2 table; 'props_part_ID' is primary key
# for the particle properties table and foreign key for the susp_part_2 table.
susp_part_2 = pd.merge(props_particles, susp_part_2, on = 'props_part_ID', how = 'outer')
# rename the 'props_part_ID' column and other property columns such that they can be differentiated for particle 1 and particle 2
susp_part_2.rename(columns={'props_part_ID':'props_part_ID2'}, inplace=True)
susp_part_2.rename(columns={'name_part':'name_part2'}, inplace=True)
susp_part_2.rename(columns={'description_part':'description_part2'}, inplace=True)
susp_part_2.rename(columns={'density_part':'density_part2'}, inplace=True)
susp_part_2.rename(columns={'thermal_cond_part':'thermal_cond_part2'}, inplace=True)
susp_part_2 = susp_part_2[(susp_part_2['sample_ID']) > 0]

# Merge particle tables, then merge particle table with suspension table
particles = pd.merge(susp_part_1, susp_part_2, on = 'sample_ID', how = 'outer')
suspension = pd.merge(particles, suspension, on = 'sample_ID', how = 'outer')
In [9]:
#-----ADDITIVE TABLES-----#
# Additive properties table
cursor.execute('SELECT * FROM props_adds')
rows = cursor.fetchall()
props_adds = pd.DataFrame( [[ij for ij in i] for i in rows])
props_adds.rename(columns={0: 'props_add_ID', 1: 'name_add', 2: 'density_add', 
                       3: 'molecular_wt_add', 4: 'last_updated'},
              inplace = True);
props_adds = props_adds.drop('last_updated', axis = 1)

# Binder 1 
cursor.execute('SELECT * FROM susp_bind_1')
rows = cursor.fetchall()
susp_bind_1 = pd.DataFrame( [[ij for ij in i] for i in rows])
susp_bind_1.rename(columns={0: 'bind_1_ID', 1: 'sample_ID', 2: 'props_add_ID', 
                       3: 'wf_bind_1', 4: 'last_updated'},
              inplace = True);
susp_bind_1 = susp_bind_1.drop('last_updated', axis = 1)
susp_bind_1 = susp_bind_1.drop('bind_1_ID', axis = 1)
susp_bind_1 = pd.merge(props_adds, susp_bind_1, on = 'props_add_ID', how = 'outer')
susp_bind_1.rename(columns={'props_add_ID':'props_bind1_ID'}, inplace=True)
susp_bind_1.rename(columns={'name_add':'name_bind1'}, inplace=True)
susp_bind_1.rename(columns={'density_add':'density_bind1'}, inplace=True)
susp_bind_1.rename(columns={'molecular_wt_add':'molecular_wt_bind1'}, inplace=True)
susp_bind_1 = susp_bind_1[(susp_bind_1['sample_ID']) > 0]
# Binder 2
cursor.execute('SELECT * FROM susp_bind_2')
rows = cursor.fetchall()
susp_bind_2 = pd.DataFrame( [[ij for ij in i] for i in rows])
susp_bind_2.rename(columns={0: 'bind_2_ID', 1: 'sample_ID', 2: 'props_add_ID', 
                       3: 'wf_bind_2', 4: 'last_updated'},
              inplace = True);
susp_bind_2 = susp_bind_2.drop('last_updated', axis = 1)
susp_bind_2 = susp_bind_2.drop('bind_2_ID', axis = 1)
susp_bind_2 = pd.merge(props_adds, susp_bind_2, on = 'props_add_ID', how = 'outer')
susp_bind_2.rename(columns={'props_add_ID':'props_bind2_ID'}, inplace=True)
susp_bind_2.rename(columns={'name_add':'name_bind2'}, inplace=True)
susp_bind_2.rename(columns={'density_add':'density_bind2'}, inplace=True)
susp_bind_2.rename(columns={'molecular_wt_add':'molecular_wt_bind2'}, inplace=True)
susp_bind_2 = susp_bind_2[(susp_bind_2['sample_ID']) > 0]
# Dispersant 1
cursor.execute('SELECT * FROM susp_disp_1')
rows = cursor.fetchall()
susp_disp_1 = pd.DataFrame( [[ij for ij in i] for i in rows])
susp_disp_1.rename(columns={0: 'disp_1_ID', 1: 'sample_ID', 2: 'props_add_ID', 
                       3: 'wf_disp_1', 4: 'last_updated'},
              inplace = True);
susp_disp_1 = susp_disp_1.drop('last_updated', axis = 1)
susp_disp_1 = susp_disp_1.drop('disp_1_ID', axis = 1)
susp_disp_1 = pd.merge(props_adds, susp_disp_1, on = 'props_add_ID', how = 'outer')
susp_disp_1.rename(columns={'props_add_ID':'props_disp1_ID'}, inplace=True)
susp_disp_1.rename(columns={'name_add':'name_disp_1'}, inplace=True)
susp_disp_1.rename(columns={'density_add':'density_disp_1'}, inplace=True)
susp_disp_1.rename(columns={'molecular_wt_add':'molecular_wt_disp_1'}, inplace=True)
susp_disp_1 = susp_disp_1[(susp_disp_1['sample_ID']) > 0]
# Dispersant 2
cursor.execute('SELECT * FROM susp_disp_2')
rows = cursor.fetchall()
susp_disp_2 = pd.DataFrame( [[ij for ij in i] for i in rows])
susp_disp_2.rename(columns={0: 'disp_2_ID', 1: 'sample_ID', 2: 'props_add_ID', 
                       3: 'wf_disp_2', 4: 'last_updated'},
              inplace = True);
susp_disp_2 = susp_disp_2.drop('last_updated', axis = 1)
susp_disp_2 = susp_disp_2.drop('disp_2_ID', axis = 1)
susp_disp_2 = pd.merge(props_adds, susp_disp_2, on = 'props_add_ID', how = 'outer')
susp_disp_2.rename(columns={'props_add_ID':'props_disp2_ID'}, inplace=True)
susp_disp_2.rename(columns={'name_add':'name_disp2'}, inplace=True)
susp_disp_2.rename(columns={'density_add':'density_disp_2'}, inplace=True)
susp_disp_2.rename(columns={'molecular_wt_add':'molecular_wt_disp_2'}, inplace=True)
susp_disp_2 = susp_disp_2[(susp_disp_2['sample_ID']) > 0]
# Cryoprotectant
cursor.execute('SELECT * FROM susp_cryo')
rows = cursor.fetchall()
susp_cryo = pd.DataFrame( [[ij for ij in i] for i in rows])
susp_cryo.rename(columns={0: 'cryo_ID', 1: 'sample_ID', 2: 'props_add_ID', 
                       3: 'wf_cryo', 4: 'last_updated'},
              inplace = True);
susp_cryo = susp_cryo.drop('last_updated', axis = 1)
susp_cryo = susp_cryo.drop('cryo_ID', axis = 1)
susp_cryo = pd.merge(props_adds, susp_cryo, on = 'props_add_ID', how = 'outer')
susp_cryo.rename(columns={'props_add_ID':'props_cryo_ID'}, inplace=True)
susp_cryo.rename(columns={'name_add':'name_cryo'}, inplace=True)
susp_cryo.rename(columns={'density_add':'density_cryo'}, inplace=True)
susp_cryo.rename(columns={'molecular_wt_add':'molecular_wt_cryo'}, inplace=True)
susp_cryo = susp_cryo[(susp_cryo['sample_ID']) > 0]
# Surfactant
cursor.execute('SELECT * FROM susp_surfact')
rows = cursor.fetchall()
susp_surfact = pd.DataFrame( [[ij for ij in i] for i in rows])
susp_surfact.rename(columns={0: 'surfact_ID', 1: 'sample_ID', 2: 'props_add_ID', 
                       3: 'wf_surfact', 4: 'last_updated'},
              inplace = True);
susp_surfact = susp_surfact.drop('last_updated', axis = 1)
susp_surfact = susp_surfact.drop('surfact_ID', axis = 1)
susp_surfact = pd.merge(props_adds, susp_surfact, on = 'props_add_ID', how = 'outer')
susp_surfact.rename(columns={'props_add_ID':'props_surfact_ID'}, inplace=True)
susp_surfact.rename(columns={'name_add':'name_surfact'}, inplace=True)
susp_surfact.rename(columns={'density_add':'density_surfact'}, inplace=True)
susp_surfact.rename(columns={'molecular_wt_add':'molecular_wt_surfact'}, inplace=True)
susp_surfact = susp_surfact[(susp_surfact['sample_ID']) > 0]
In [10]:
# The following additive tables pertain mainly to the freeze gel-casting technique; uncomment and run to import
# Catalyst

#cursor.execute('SELECT * FROM susp_catal')
#rows = cursor.fetchall()
#susp_catal = pd.DataFrame( [[ij for ij in i] for i in rows])
#susp_catal.rename(columns={0: 'catal_ID', 1: 'sample_ID', 2: 'props_add_ID', 
#                       3: 'wf_catal', 4: 'last_updated'},
#              inplace = True);
#susp_catal = susp_catal.drop('last_updated', axis = 1)
#susp_catal = susp_catal.drop('catal_ID', axis = 1)
#susp_catal = pd.merge(props_adds, susp_catal, on = 'props_add_ID', how = 'outer')
#susp_catal.rename(columns={'props_add_ID':'props_catal_ID'}, inplace=True)
# Crosslinker
#cursor.execute('SELECT * FROM susp_cross')
#rows = cursor.fetchall()
#susp_cross = pd.DataFrame( [[ij for ij in i] for i in rows])
#susp_cross.rename(columns={0: 'cross_ID', 1: 'sample_ID', 2: 'props_add_ID', 
#                       3: 'wf_cross', 4: 'last_updated'},
#              inplace = True);
#susp_cross = susp_cross.drop('last_updated', axis = 1)
#susp_cross = susp_cross.drop('cross_ID', axis = 1)
#susp_cross = pd.merge(props_adds, susp_cross, on = 'props_add_ID', how = 'outer')
#susp_cross.rename(columns={'props_add_ID':'props_cross_ID'}, inplace=True)
# Initiator
#cursor.execute('SELECT * FROM susp_init')
#rows = cursor.fetchall()
#susp_init = pd.DataFrame( [[ij for ij in i] for i in rows])
#susp_init.rename(columns={0: 'init_ID', 1: 'sample_ID', 2: 'props_add_ID', 
#                       3: 'wf_init', 4: 'last_updated'},
#              inplace = True);
#susp_init = susp_init.drop('last_updated', axis = 1)
#susp_init = susp_init.drop('init_ID', axis = 1)
#susp_init = pd.merge(props_adds, susp_init, on = 'props_add_ID', how = 'outer')
#susp_init.rename(columns={'props_add_ID':'props_init_ID'}, inplace=True)
# Monomer
#cursor.execute('SELECT * FROM susp_mono')
#rows = cursor.fetchall()
#susp_mono = pd.DataFrame( [[ij for ij in i] for i in rows])
#susp_mono.rename(columns={0: 'mono_ID', 1: 'sample_ID', 2: 'props_add_ID', 
#                       3: 'wf_mono', 4: 'last_updated'},
#              inplace = True);
#susp_mono = susp_mono.drop('last_updated', axis = 1)
#susp_mono = susp_mono.drop('mono_ID', axis = 1)
#susp_mono = pd.merge(props_adds, susp_mono, on = 'props_add_ID', how = 'outer')
#susp_mono.rename(columns={'props_add_ID':'props_mono_ID'}, inplace=True)
In [11]:
# Merge suspension tables
suspension = pd.merge(suspension, susp_bind_1, on = 'sample_ID', how = 'outer')
suspension = pd.merge(suspension, susp_bind_2, on = 'sample_ID', how = 'outer')
suspension = pd.merge(suspension, susp_disp_1, on = 'sample_ID', how = 'outer')
suspension = pd.merge(suspension, susp_disp_2, on = 'sample_ID', how = 'outer')
suspension = pd.merge(suspension, susp_cryo, on = 'sample_ID', how = 'outer')
suspension = pd.merge(suspension, susp_surfact, on = 'sample_ID', how = 'outer')

# Uncomment and run the code block below if freeze gel-casting additive tables were imported
#suspension = pd.merge(suspension, susp_catal, on = 'sample_ID', how = 'outer')
#suspension = pd.merge(suspension, susp_cross, on = 'sample_ID', how = 'outer')
#suspension = pd.merge(suspension, susp_init, on = 'sample_ID', how = 'outer')
#suspension = pd.merge(suspension, susp_mono, on = 'sample_ID', how = 'outer')

# Merge suspension table with dataframe
data = pd.merge(suspension, data, on = 'sample_ID', how = 'outer')

Import solidification, sublimation, and sintering tables, merging each group with the dataframe

In [12]:
#-----SOLIDIFICATION TABLES-----#
# Main solidification table
cursor.execute('SELECT * FROM solidification')
rows = cursor.fetchall()
solidification = pd.DataFrame( [[ij for ij in i] for i in rows])
solidification.rename(columns={0: 'solidification_ID', 1: 'sample_ID', 2: 'technique', 
                               3: 'direction', 4: 'refrigerant', 5: 'cooling_rate', 
                               6: 'temp_cold', 7: 'temp_hot', 8: 'temp_constant',
                               9: 'gravity', 10: 'gradient', 11: 'velocity', 
                               12: 'temp_nuc', 13: 'temp_susp', 14: 'last_updated'},
              inplace = True);
solidification = solidification.drop('last_updated', axis = 1)
solidification = solidification.drop('solidification_ID', axis = 1)

# Mold properties table
cursor.execute('SELECT * FROM props_mold')
rows = cursor.fetchall()
props_mold = pd.DataFrame( [[ij for ij in i] for i in rows])
props_mold.rename(columns={0: 'props_mold_ID', 1: 'name_mold_mat', 2: 'thermal_cond_mold', 
                               },
              inplace = True);
# Solidification mold table
cursor.execute('SELECT * FROM solidification_mold')
rows = cursor.fetchall()
solidification_mold = pd.DataFrame( [[ij for ij in i] for i in rows])
solidification_mold.rename(columns={0: 'mold_ID', 1: 'sample_ID', 2: 'shape_mold', 
                                    3: 'dia_mold', 4: 'height_mold', 5: 'length_mold',
                                    6: 'width_mold', 7: 'wall_mold', 8: 'fillheight', 
                                    9: 'props_mold_ID', 10: 'last_update'},
              inplace = True);
# Merge mold properties table with solidification_mold table
solidification_mold = pd.merge(props_mold, solidification_mold, on = 'props_mold_ID', how = 'outer')
# Merge solidification_mold table with solidification table
solidification = pd.merge(solidification_mold, solidification, on = 'sample_ID', how = 'outer')
# Drop rows that are not associated with a sample ID
solidification = solidification[(solidification['sample_ID']) > 0]
# Merge solidification table with dataframe
data = pd.merge(solidification, data, on = 'sample_ID', how = 'outer')
In [13]:
#-----SUBLIMATION TABLE-----#
cursor.execute('SELECT * FROM sublimation')
rows = cursor.fetchall()
sublimation = pd.DataFrame( [[ij for ij in i] for i in rows])
sublimation.rename(columns={0: 'sublimation_ID', 1: 'sample_ID', 2: 'sublimated', 
                           3: 'pressure_sub', 4:'time_sub', 5: 'temp_sub', 6: 'last_updated'},
              inplace = True);
sublimation = sublimation.drop('last_updated', axis = 1)
sublimation = sublimation.drop('sublimation_ID', axis = 1)
# Merge sublimation table with dataframe
data = pd.merge(sublimation, data, on = 'sample_ID', how = 'outer')
In [14]:
#-----SINTERING TABLES-----#
# Sinter1 table
cursor.execute('SELECT * FROM sinter_1')
rows = cursor.fetchall()
sinter_1 = pd.DataFrame( [[ij for ij in i] for i in rows])
sinter_1.rename(columns={0: 'sinter_1_ID', 1: 'sample_ID', 2: 'time_sinter_1', 
                         3: 'temp_sinter_1', 4: 'rampC_sinter_1', 5: 'rampH_sinter_1',
                         6: 'last_updated'},
              inplace = True);
sinter_1 = sinter_1.drop('last_updated', axis = 1)
sinter_1 = sinter_1.drop('sinter_1_ID', axis = 1)
# Sinter2 table
cursor.execute('SELECT * FROM sinter_2')
rows = cursor.fetchall()
sinter_2 = pd.DataFrame( [[ij for ij in i] for i in rows])
sinter_2.rename(columns={0: 'sinter_2_ID', 1: 'sample_ID', 2: 'time_sinter_2', 
                         3: 'temp_sinter_2', 4: 'rampC_sinter_2', 5: 'rampH_sinter_2',
                         6: 'last_updated'},
              inplace = True);
sinter_2 = sinter_2.drop('last_updated', axis = 1)
sinter_2 = sinter_2.drop('sinter_2_ID', axis = 1)
# Merge sintering tables, then merge sintering table with samples table
sinter = pd.merge(sinter_1, sinter_2, on = 'sample_ID', how = 'outer')
data = pd.merge(sinter, data, on = 'sample_ID', how = 'outer')

#Shrinkage table
cursor.execute('SELECT * FROM shrinkage')
rows = cursor.fetchall()
shrink = pd.DataFrame( [[ij for ij in i] for i in rows])
shrink.rename(columns={0: 'shrinkage_ID', 1: 'sample_ID', 2: 'shrink_vol', 
                         3: 'shrink_dia', 4: 'shrink_lin', 5: 'last_updated'},
              inplace = True);
shrink = shrink.drop('last_updated', axis = 1)
shrink = shrink.drop('shrinkage_ID', axis = 1)
data = pd.merge(shrink, data, on = 'sample_ID', how = 'outer')

Import microstructure and mechanical properties tables, merging each with the dataframe

In [15]:
#-----MICROSTRUCTURE TABLES-----#
cursor.execute('SELECT * FROM microstructure')
rows = cursor.fetchall()
microstructure = pd.DataFrame( [[ij for ij in i] for i in rows])
microstructure.rename(columns={0: 'micro_ID', 1: 'sample_ID', 2: 'pore_structure', 
                              3: 'porosity', 4:'spacing', 5:'pore', 6: 'wall', 
                              7: 'aspectRatio_pore', 8: 'aspectRatio_wall',
                              9: 'surface_area', 10: 'last_updated'},
              inplace = True);
microstructure = microstructure.drop('last_updated', axis = 1)
microstructure = microstructure.drop('micro_ID', axis = 1)

# Merge microstructure table with dataframe
data = pd.merge(microstructure, data, on = 'sample_ID', how = 'outer')
In [16]:
#-----MECHANICAL TABLES-----#
# Main mechanical table
cursor.execute('SELECT * FROM mechanical')
rows = cursor.fetchall()
mechanical = pd.DataFrame( [[ij for ij in i] for i in rows])
mechanical.rename(columns={0: 'mech_ID', 1: 'sample_ID', 2: 'shape_mech',
                          3: 'height_mech', 4: 'dia_mech', 5: 'length_mech', 
                          6: 'width_mech', 7: 'ratio_mech', 8: 'volume_mech', 
                          9: 'compressive', 10: 'flexural', 11: 'elastic', 
                          12: 'strain_rate', 13: 'crossheadspeed', 14: 'last_updated'},
              inplace = True);
mechanical = mechanical.drop('last_updated', axis = 1)
mechanical = mechanical.drop('mech_ID', axis = 1)

# Bulk material properties table
cursor.execute('SELECT * FROM props_mech_bulk')
rows = cursor.fetchall()
mech_bulk = pd.DataFrame( [[ij for ij in i] for i in rows])
mech_bulk.rename(columns={0: 'material_ID', 1: 'name_material', 2: 'density_material_bulk',
                          3: 'compressive_bulk', 4: 'elastic_bulk', 5: 'flexural_bulk', 
                          6: 'last_updated'},
              inplace = True);

# Merge bulk mechanical and bulk material properties tables with dataframe
data = pd.merge(mechanical, data, on = 'sample_ID', how = 'outer')
data = pd.merge(data, mech_bulk, on = 'material_ID', how = 'outer')

# Drop rows that are not associated with a sample ID
data = data[(data['sample_ID']) > 0]
# Create columns for normalized mechanical values
data['norm_compressive'] = (data['compressive']/data['compressive_bulk'])
data['norm_flexural'] = (data['flexural']/data['flexural_bulk'])
data['norm_elastic'] = (data['elastic']/data['elastic_bulk'])
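
At this point, the merged dataframe contains one row per sample with citation, processing, microstructure, and mechanical data, so select columns can be exported to a CSV file as mentioned above. The snippet below is an illustrative sketch only; the column selection and file name are arbitrary.

# Optional: export select columns of the merged dataframe to a CSV file
# (illustrative column list and file name)
#export_cols = ['sample_ID', 'paper_ID', 'material', 'vf_total', 'porosity', 'compressive']
#data[export_cols].to_csv('freezecasting_export.csv', index=False)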

Analysis

Function for printing number of papers corresponding to plot points

In [17]:
# define function to return number of papers for a specified figure
def printPapers(data, FigX):
    set_Papers = set(data['paper_ID'])
    num_Papers = len(set_Papers)
    print 'Number of papers', '(', FigX, '):', num_Papers
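
For example, passing the full dataframe (or any filtered subset used to build a particular figure) together with a free-form figure label prints the corresponding paper count:

# Example usage (the label string is arbitrary)
#printPapers(data, 'Fig. 1')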

Explore variability in terms of how the freeze-casting technique is used

Print list of material types fabricated using the freeze-casting technique

In [37]:
materials = papers['material'].unique()
materials.sort()
print materials
# Return the number of array elements, which gives the total number of unique materials
len(materials)
[None nan '' 'Ag' 'Ag/PDMS' 'Akr' 'Al-12Si/Al2O3' 'Al-7Si-5Cu/TiC'
 'Al-Mg-Si/Al2O3' 'Al-Si-Mg/SiC' 'Al-Si/Al2O3' 'Al/Al2O3' 'Al/TiC' 'Al2O3'
 'Al2O3/CaP' 'Al2O3/Carbon' 'Al2O3/Chitosan/Gelatin' 'Al2O3/Cu' 'Al2O3/HAP'
 'Al2O3/Mullite' 'Al2O3/PMMA' 'Al2O3/SiC' 'Al2O3/SiO2' 'Al2O3/Y-TZP'
 'Al2O3/ZrO2' 'Al2SiO5' 'AlN' 'AlON' 'AlSi12/Al2O3' 'AlSi12\x96Al2O3'
 'Alginate' 'Alginate/Chitosan' 'Apatite' 'Apatite/Collagen'
 'Au/Ag/Pd/Pt/Fe2O3/CdSe/CdS' 'Au/PS' 'B4C' 'BADCy/NiTiNb04'
 'BADCy/NiTiNbO4' 'BAS/Si3N4' 'BCP' 'BN' 'BN/Silicone' 'BT/HAP'
 'BT/Kaolinite' 'BaTiO3' 'BaTiO3/Epoxy' 'BaTiO3/HAP' 'BaTiO3/PMMA'
 'Bentonite' 'Bentonite/PVA' 'Bioglass' 'Bioglass/HAP' 'CNT' 'CNT '
 'CNTs/Chitosan' 'CP/SiO2' 'Ca2O4Si' 'Ca3ZrSi2O9' 'CaF2' 'CaP' 'CaP/PGA'
 'CaP/PLA' 'CaSiO3' 'Carbon' 'Carbon/Chitosan' 'Carbon/PVA' 'Cellulose'
 'Cellulose/PVA' 'Chitin' 'Chitosan' 'Chitosan/CNTs' 'Chitosan/Collagen'
 'Chitosan/PGA' 'Chitosan/PLGA' 'Chitosan/PLLA' 'Chitosan/PMMA'
 'Chitosan/Silk fibroin' 'Clay' 'Clay/Aerogel' 'Co3O4' 'Cocoa' 'Collagen'
 'Cordierite' 'Cr3C2' 'Cu' 'Cu/C' 'CuSiO4' 'DEAEMA' 'Dextran' 'EDMA' 'Fe'
 'Fe2O3' 'GO' 'GO/Aerogel' 'GO/SnO2' 'GaIn/PDMS' 'GaPO4' 'Gelatin'
 'Gelatin/Genipin' 'Gelatin/PLGA' 'Glass' 'Graphene' 'Graphene oxide'
 'Graphene/Aerogel' 'Graphene/Carbon' 'Graphene/Chitosan' 'Graphene/HRGO'
 'Graphene/PDMS' 'Graphene/PVA' 'Graphene/Paraffin' 'Graphene/Polystyrene'
 'Graphene/Pt/Nafion' 'Graphene/epoxy' 'HAP' 'HAP/Bioglass' 'HAP/CP'
 'HAP/Cellulose/PVA' 'HAP/Chitosan' 'HAP/Chitosan//PMMA' 'HAP/Collagen'
 'HAP/Gelatin' 'HAP/PCL' 'HAP/PMMA' 'HAP/SiO2' 'HAP/TCP' 'HAP/ZrO2'
 'Hematite' 'Kaolin' 'Kaolin/Silicate' 'Kaolinite' 'Kaolinite/SiO2' 'Kefir'
 'LSCF' 'LSCF/CGO' 'LSM/YSZ' 'Laponite' 'Latex' 'Li3V2(PO4)' 'LiFePO4/C'
 'LiFePo4/C' 'Merwinite' 'Merwinite ' 'Mg/SiC' 'MnO2' 'Mo' 'Mo/W' 'Model'
 'Mullite' 'Mullite/SiO2' 'Mullite/ZrO2' 'Nb2O5' 'Neurotrophin/Chitosan'
 'Ni' 'Ni-YSZ' 'Ni/YSZ' 'NiO-GDC' 'NiO/YSZ' 'PAN' 'PANI' 'PCL/PDDA' 'PEG'
 'PEI' 'PHEMA' 'PHPV' 'PLA' 'PLLA' 'PLLA/PHBV' 'PMMA' 'PNIPA' 'PS' 'PSS'
 'PTEGDMA' 'PU' 'PVA' 'PVDF' 'PVL' 'PZT' 'Palygorskite' 'Paracetamol'
 'Paracetamol ' 'Polystyrene' 'RF' 'Resorcinol/Formaldehyde' 'Review'
 'SCMC/PVA' 'Si/Graphene' 'Si3N4' 'Si4Al2O2N6' 'SiC' 'SiC/2024Al' 'SiO2'
 'SiO2/Aerogel' 'SiO2/Al2O3' 'SiO2/Chitosan' 'SiO2/Lactose' 'SiO2/PAA'
 'SiO2/PVA' 'SiO2/Polyisobutylene' 'SiO2/TiO2' 'SiO2/Y2O3' 'SiO2/YSZ'
 'SiOC' 'Silicate' 'Silk fibroin' 'Silk fibroin  ' 'SnO2/Graphene' 'Steel'
 'Sugar' 'TCP' 'TCP/Chitosan' 'TEOS/PDAC' 'TILITE'
 'Theophyllinum/potato starch' 'Ti' 'TiO2' 'TiO2/Chitosan' 'V2O5-CNT'
 'VOPO4/Graphene' 'W' 'Y2SiO5' 'YAG' 'YAG/fiber' 'YSZ' 'YSZ/Graphene'
 'Yb2SiO5' 'ZTA' 'Zeolite' 'ZrAc' 'ZrB2' 'ZrB2-SiC' 'ZrB2/SiC' 'ZrO2'
 'ZrO2/Al' 'ZrO2/Epoxy' 'ZrO2/PMMA' '[LiNiMn]O2' 'gelatin/agarose'
 'graphene/ZrB2' 'silk fibroin/chitosan/Fe2O3']
Out[37]:
240

Print list of binder, dispersant, cryoprotectant, and surfactant types that have been utilized

In [36]:
additives = data[['name_bind1', 'name_bind2', 'name_disp_1', 'name_disp2', 'name_cryo', 'name_surfact']].copy()
# Here, we are locating unique values throughout the dataframe, irrespective of which column they appear in
additivesUnique = pd.unique(additives.values.ravel())
print additivesUnique
# Return the number of array elements, which gives the total number of unique additives
len(additivesUnique)
[nan 'NOI' 'Duramax B-1000' 'NH4PAA' 'PEG' 'PVA' 'Darvan 811'
 'Dolapix CE 64' 'Aquazol' 'citric acid' 'Dynol 604' 'HydroDisper A160'
 'sodium polyacrylate' 'check' 'Hypermer KD 4' 'glycerol' 'SND 6800' 'PS'
 'Darvan C' 'Dispex A40' 'gelatin' 'DuPont Elvanol' 'Duramax D3005'
 'Lopon 885' 'polyacrylic acid' 'C2H3CONH2' 'Darvan 821A' 'CMC'
 'Dolapix P62' 'Darvan CN' 'Texaphor 963' 'acetic acid' 'Hypemer KD'
 'sodium polymethacrylate' 'Dolapix PC33' 'Dolapix' 'Duramax' 'PVB'
 'Dolapix C64' 'Duramax HA-12' 'Fluka' 'PMMA' 'polyacrylate' 'saccharose'
 'polymethacrlate' 'Bentonite' 'duramax B0107' 'Texaphor 3250'
 'isopropanol alcohol' 'Triton X-100' 'sodium hexametaphosphate' 'agar'
 'PEG 300' 'Ammonium persulfate' 'HydroDisperser' 'urea' 'Darvan 812A'
 'PAAS' 'Darvan 7-N' 'Darvan 7' 'Ethanol'
 'ammonium polymethacrylate anionic dispersant' 'PVP'
 'poly-(2-ethyl-2-oxazoline' 'Perfad9100' 'starch'
 'Kalium polyacrylate Lopon 895' 'Duramax B-1001' 'polyethylene-imine'
 'A6114' 'AFP' 'Dolapix CA' 'Optapix PAF 60' 'SHMP' 'HCl' 'acacia' 'CTAB'
 'sodium alginate' 'Dolapix PC21' 'Prox B03' 'fish oil' 'DuramaxB-1022'
 'Stearic acid' 'BYK-163' 'isostearic acid' 'Easysperse' 'AES' 'TMAH'
 'CH10' 'NH4OH' 'HAO Fast 923' 'ethyl cellulose ethoecel' 'DXI' 'Na2O'
 'ovalbumin' 'polyvinyl pyrrolidione' 'PEI' 'Lioperse 511' 'laponite'
 'Kaocera' 'xanthan gum' 'Dolapix PC 33' 'Optapix AC112' 'sodium silicate'
 'NaOH' 'Zephrym PD' 'isomalt' 'xylitol' 'DISPERBYK-190']
Out[36]:
109

Figures

Figure 1. Research growth of the freeze-casting technique, as illustrated by the number of papers published in peer-reviewed journals since 2000.

In [19]:
fig1 = plt.figure(figsize=(8,5))
ax = fig1.add_subplot(111)
# Create a variable for the year each paper was published; drop NA values and convert the data type to integer
b = papers['year']
b = b.dropna()
b = b.astype(int)

# create x-array for years
x = [2000, 2001, 2002, 2003, 2004, 2005, 2006, 
     2007, 2008, 2009, 2010, 2011, 2012, 2013, 
     2014, 2015, 2016]

# create y-array using the sum of papers published in a given year
y = [
    sum(i == 2000 for i in b), sum(i == 2001 for i in b), sum(i == 2002 for i in b),
    sum(i == 2003 for i in b), sum(i == 2004 for i in b), sum(i == 2005 for i in b),
    sum(i == 2006 for i in b), sum(i == 2007 for i in b), sum(i == 2008 for i in b),
    sum(i == 2009 for i in b), sum(i == 2010 for i in b), sum(i == 2011 for i in b),
    sum(i == 2012 for i in b), sum(i == 2013 for i in b), sum(i == 2014 for i in b),
    sum(i == 2015 for i in b), sum(i == 2016 for i in b)
]
# bar graph
ax.bar(x, y, alpha=0.7)
# set x and y limits
ax.set(xlim = [2000, 2017])
ax.set(ylim = [0, 200])
# tick positions and rotation
ax.xaxis.set_ticks(np.arange(2000, 2017, 1))
ax.yaxis.set_ticks(np.arange(50, 250, 50))
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')
for tick in ax.get_xticklabels():
    tick.set_rotation(60)
# x and y labels
ax.set_ylabel('Number of papers published')
ax.set_xlabel('Year')
# minor ticks
minorLocator1 = AutoMinorLocator()
ax.yaxis.set_minor_locator(minorLocator1)

Figure 2. Locations of corresponding authors for freeze-casting literature published between the years 2000 and 2016.

In [20]:
# Uncomment lines below to either install or upgrade matplotlib basemap 
# ! pip install basemap
# ! pip install basemap --upgrade
from mpl_toolkits.basemap import Basemap

fig2 = plt.figure(figsize=(10,10))
fig2 = Basemap(projection = 'robin', resolution = "c", area_thresh=100000.0, lon_0 = 15, lat_0 = 0)
fig2.drawcoastlines(linewidth=0.9)
fig2.drawcountries(linewidth=0.9)
# If memory issues are encountered rendering the figure, reduce the scale value below
fig2.etopo(scale=0.5, alpha=0.6)
fig2.drawmeridians(np.arange(0, 360, 30), linewidth=0.8)
fig2.drawparallels(np.arange(-90, 90, 30), linewidth=0.8)
# x and y are longitude/latitude values
x,y = fig2(authors['lat'].values, authors['long'].values)
fig2.plot(x, y, '*', markersize=10, color="red")