DataGraph

[1]:
"""
    TITLE   : Data Graph
    AUTHOR  : Nathaniel Starkman
    PROJECT : Utilipy
""";

__author__ = 'Nathaniel Starkman'
__version__ = 'May 18, 2020'

About

There are two options for handling inputs when writing a function: accept a wide variety of input types, or accept only one. The former is very convenient for the user but a pain for the developer, especially on the testing end. The latter puts all the onus on the user, and data reformatting is an enormous pain. I’ve been thinking about this for a while and I think there is a third, and often better, option: an intermediate function that handles the conversion and can be applied to any function as a decorator. The advantage of this approach is threefold:

  1. the user gets a function that accepts a wide array of inputs.

  2. the developer can focus on writing a purpose-built function that is easily tested and doesn’t have a million preamble lines handling different data types.

  3. the data conversion functions can independently be tested. This nicely separates testing the function from testing the input options.

I realized I had already been doing this to some extent: I’ve been writing and using decorators which do some mild data conversion on specified arguments. This very light option can suffice, and might be best in some cases, but I ran into the problem that it’s difficult to test a decorator that’s not applied to a function. Furthermore, by locking the conversions into the decorator, I could not use them elsewhere. There are a few potential solutions, which I’ll list below.
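As a rough sketch of what I mean by a light conversion decorator, the example below applies a converter to each named argument before the wrapped function runs. The names `convert_arguments` and `scaled_sum` are made up for this illustration and are not part of utilipy.

```python
import functools
import inspect


def convert_arguments(**converters):
    """Apply a converter to each named argument before calling the function.

    Hypothetical helper for illustration only; not part of utilipy.
    """

    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments onto parameter names.
            bound = sig.bind(*args, **kwargs)
            for name, converter in converters.items():
                # Only convert arguments that were actually passed.
                if name in bound.arguments:
                    bound.arguments[name] = converter(bound.arguments[name])
            return func(*bound.args, **bound.kwargs)

        return wrapper

    return decorator


@convert_arguments(values=list, scale=float)
def scaled_sum(values, scale=1.0):
    # The body can assume `values` is a list and `scale` is a float.
    return scale * sum(values)


result = scaled_sum((1, 2, 3), scale="2")
```

Here the converters are passed in explicitly, so the decorator still has to be told, per argument, exactly which conversion to apply.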

Solution 1: just have a module with a whole bunch of conversion functions and have the decorator call these functions. This is the Pandoc solution. This solution does work, it’s just very manual: the decorator needs a huge if/else switchboard testing types. I’m not knocking Pandoc, which is fantastic, but I want something a little more automatic.
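To make the "switchboard" concrete, here is a minimal sketch using only built-in types. The function names are hypothetical, chosen for this illustration; every new input type needs both a new conversion function and a new branch.

```python
def str_to_list(data):
    # Split a whitespace-separated string into words.
    return data.split()


def tuple_to_list(data):
    return list(data)


def dict_to_list(data):
    # Keep only the values of the dictionary.
    return list(data.values())


def to_list(data):
    """Manual switchboard: one branch per supported input type."""
    if isinstance(data, list):
        return data
    elif isinstance(data, str):
        return str_to_list(data)
    elif isinstance(data, tuple):
        return tuple_to_list(data)
    elif isinstance(data, dict):
        return dict_to_list(data)
    raise TypeError(f"cannot convert {type(data).__name__} to list")
```

The conversion functions themselves are easy to test in isolation; it is the dispatch in `to_list` that has to be maintained by hand.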

Solution 2: A callable registry. This takes the data’s type and the desired output format as arguments and returns the correct transformation function. This solution can be built upon solution 1, and offers a great deal more flexibility.
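A callable registry along these lines might look like the sketch below. `ConversionRegistry` and `as_list` are names I'm inventing for the example; the only idea taken from the text is the lookup keyed on the input type and the desired output type.

```python
class ConversionRegistry:
    """Look up a conversion function from an (input type, output type) pair.

    Hypothetical class for illustration only.
    """

    def __init__(self):
        self._registry = {}

    def register(self, fromtype, totype):
        """Decorator that stores a function under the key (fromtype, totype)."""

        def decorator(func):
            self._registry[(fromtype, totype)] = func
            return func

        return decorator

    def __call__(self, fromtype, totype):
        # Nothing to do if the data already has the desired type.
        if fromtype is totype:
            return lambda data: data
        try:
            return self._registry[(fromtype, totype)]
        except KeyError:
            raise TypeError(
                f"no conversion from {fromtype.__name__} to {totype.__name__}"
            ) from None


registry = ConversionRegistry()


@registry.register(str, list)
def str_to_list(data):
    return data.split()


@registry.register(tuple, list)
def tuple_to_list(data):
    return list(data)


def as_list(data):
    # The if/else switchboard is replaced by a single lookup.
    return registry(type(data), list)(data)
```

Adding a new input type is now one decorated function, with no change to `as_list`. The lookup only finds direct conversions, though.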

Astropy had (and solved) a similar problem: how to transform objects in one reference frame to another reference frame. Their solution, which is quite elegant, is to build a graph whose nodes are reference frames and whose edges are transformation functions. In this way a coordinate frame can traverse the graph and be transformed into a frame for which no direct transformation function exists. This is what I want, but for arbitrary data types. So I will borrow and repurpose Astropy’s TransformGraph code.
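The graph idea can be sketched with built-in types and a plain breadth-first search. This is a toy of my own, not Astropy's code (whose path search is more involved), and the class name `TinyTransformGraph` and the `graph[fromtype][totype]` layout are assumptions made for the example.

```python
from collections import defaultdict, deque


class TinyTransformGraph:
    """Toy graph: nodes are types, edges are conversion functions."""

    def __init__(self):
        # Assumed layout for this sketch: graph[fromtype][totype] = func
        self._graph = defaultdict(dict)

    def add_transform(self, fromtype, totype, func):
        self._graph[fromtype][totype] = func

    def find_path(self, fromtype, totype):
        """Breadth-first search; returns the functions along a shortest path."""
        queue = deque([(fromtype, [])])
        seen = {fromtype}
        while queue:
            node, funcs = queue.popleft()
            if node is totype:
                return funcs
            for nxt, func in self._graph.get(node, {}).items():
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, funcs + [func]))
        raise TypeError(
            f"no path from {fromtype.__name__} to {totype.__name__}"
        )

    def get_transform(self, fromtype, totype):
        funcs = self.find_path(fromtype, totype)

        def composite(data):
            # Apply each edge's function in order along the path.
            for func in funcs:
                data = func(data)
            return data

        return composite


graph = TinyTransformGraph()
graph.add_transform(str, tuple, lambda s: tuple(s.split(",")))
graph.add_transform(tuple, list, list)

# There is no direct str -> list edge; the graph routes through tuple.
converted = graph.get_transform(str, list)("a,b,c")
```

Only two direct conversions are registered, yet `str` to `list` works because the search chains them together.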


Prepare

Imports

[2]:
from utilipy import ipython
# ipython.set_autoreload(2)

# BUILT-IN

# THIRD PARTY

from astropy.coordinates import SkyCoord, ICRS, Galactic
import astropy.units as u

# PROJECT-SPECIFIC

from starkman_thesis.utils.data import datagraph

set autoreload to 1


Adding a Transformation

[3]:
def ICRS_to_SkyCoord(data):
    return SkyCoord(data)

dg = datagraph.TransformGraph()
dg.add_transform(ICRS, SkyCoord, ICRS_to_SkyCoord)
[4]:
data = ICRS(ra=1*u.deg, dec=10*u.deg)

dg._graph[SkyCoord][ICRS](data)
[4]:
<SkyCoord (ICRS): (ra, dec) in deg
    (1., 10.)>

By Decorator

[5]:
dg = datagraph.TransformGraph()  # make new

@dg.register(datagraph.DataTransform, ICRS, SkyCoord, func_kwargs={"sayhi": True})
def ICRS_to_SkyCoord(data, sayhi=False):
    if sayhi:
        print("Hi")
    return SkyCoord(data)
[6]:
dg._graph[SkyCoord][ICRS](data)
dg.get_transform(ICRS, SkyCoord)(data, sayhi=False)
Hi
[6]:
<SkyCoord (ICRS): (ra, dec) in deg
    (1., 10.)>
[6]:
<SkyCoord (ICRS): (ra, dec) in deg
    (1., 10.)>
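
The single `Hi` in the output above is consistent with `func_kwargs` acting as stored defaults that call-time keyword arguments override. The `DataTransform` internals aren't shown in this notebook, so the class below (`StoredKwargTransform`, a made-up name) is only a guess at that pattern, not the actual implementation.

```python
class StoredKwargTransform:
    """Wrap a function together with default keyword arguments.

    Hypothetical stand-in for illustration; not the real DataTransform.
    """

    def __init__(self, func, func_kwargs=None):
        self.func = func
        self.func_kwargs = dict(func_kwargs or {})

    def __call__(self, data, **kwargs):
        # Call-time keyword arguments take precedence over stored ones.
        merged = {**self.func_kwargs, **kwargs}
        return self.func(data, **merged)


greetings = []


def to_upper(data, sayhi=False):
    if sayhi:
        greetings.append("Hi")
    return data.upper()


transform = StoredKwargTransform(to_upper, func_kwargs={"sayhi": True})
first = transform("abc")                # uses the stored sayhi=True
second = transform("abc", sayhi=False)  # the call-time value wins
```

After both calls `greetings` holds a single entry, mirroring the one `Hi` printed by the cell.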

Composite Transformation

[7]:
dg = datagraph.TransformGraph()

@dg.register(datagraph.DataTransform, Galactic, ICRS)
def Galactic_to_ICRS(data):
    return data.transform_to(ICRS)

@dg.register(datagraph.DataTransform, ICRS, SkyCoord)
def ICRS_to_SkyCoord(data):
    return SkyCoord(data)

[8]:
data = Galactic(l=20*u.deg, b=10*u.deg, distance=10*u.kpc)

comp = datagraph.CompositeTransform([dg._graph[ICRS][Galactic], dg._graph[SkyCoord][ICRS]], Galactic, SkyCoord)
comp(data)

[8]:
<SkyCoord (ICRS): (ra, dec, distance) in (deg, deg, kpc)
    (267.97901121, -6.71707124, 10.)>

TransformGraph function decorator

[9]:
dg = datagraph.TransformGraph()

@dg.register(datagraph.DataTransform, Galactic, ICRS)
def Galactic_to_ICRS(data):
    return data.transform_to(ICRS)

@dg.register(datagraph.DataTransform, ICRS, SkyCoord, func_kwargs={"sayhi": True})
def ICRS_to_SkyCoord(data, sayhi=False):
    if sayhi:
        print("Hi")
    return SkyCoord(data)

[10]:
@dg.function_decorator(arg1=SkyCoord, arg2=ICRS)
def example_function(arg1, arg2):
    """Example Function

    Parameters
    ----------
    arg1 : SkyCoord
    arg2 : ICRS

    Other Parameters
    ----------------
    None. Unles...

    """
    if not isinstance(arg1, SkyCoord):
        raise ValueError
    if not isinstance(arg2, ICRS):
        raise ValueError
    return arg1, arg2

# /def

data = Galactic(l=20*u.deg, b=10*u.deg, distance=10*u.kpc)

example_function(data, data)
Hi
[10]:
(<SkyCoord (ICRS): (ra, dec, distance) in (deg, deg, kpc)
     (267.97901121, -6.71707124, 10.)>,
 <ICRS Coordinate: (ra, dec, distance) in (deg, deg, kpc)
     (267.97901121, -6.71707124, 10.)>)
[11]:
example_function?
Signature: example_function(arg1, arg2, *, _skip_decorator=False)
Docstring:
Example Function

Parameters
----------
arg1 : SkyCoord
arg2 : ICRS

Other Parameters
----------------
None. Unles...
_skip_decorator : bool, optional
    Whether to skip the decorator.
    default False

Notes
-----
This function is wrapped with a data `~TransformGraph` decorator.
See `~TransformGraph.function_decorator` for details.
The transformation arguments are also attached to this function
as the attribute ``._transforms``.
The affected arguments are: arg1, arg2
File:      ~/Documents/Thesis/notebooks/<ipython-input-10-020f8c40d761>
Type:      function

[12]:
example_function._transforms
[12]:
{'arg1': astropy.coordinates.sky_coordinate.SkyCoord,
 'arg2': astropy.coordinates.builtin_frames.icrs.ICRS}

Testing DataTransform call

[13]:
def test_func(data, x, y, a=1, *args, k, l=2, **kwargs):
    print(x, y, f"a={a}", args, k, f"l={l}", kwargs)
    return data

import inspect
sig = inspect.signature(test_func)
argspec = inspect.getfullargspec(test_func)

ba = sig.bind_partial(None, -1, 0, 1, "a1", "a2", k=3, t=10)
ba.apply_defaults()

args = (-10, -11)
kwargs = {"k": 4, "t3": 11}
ba2 = sig.bind_partial(data, *args, **kwargs)

# if argspec.varargs in ba2.arguments:
#     ba2.arguments[argspec.varargs].update(ba.arguments[argspec.varargs])
if argspec.varkw in ba2.arguments:
    ba2.arguments[argspec.varkw].update(ba.arguments[argspec.varkw])

ba.arguments.update(ba2.arguments)
test_func(*ba.args, **ba.kwargs)
-10 -11 a=1 ('a1', 'a2') 4 l=2 {'t3': 11, 't': 10}
[13]:
<Galactic Coordinate: (l, b, distance) in (deg, deg, kpc)
    (20., 10., 10.)>

Testing x-match relevant transformation

[14]:
from astropy.time import Time
from astropy.table import Table

dg = datagraph.TransformGraph()

@dg.register(datagraph.DataTransform, Table, SkyCoord)
def Table_to_SkyCoord(data):
    """`~Table` to `SkyCoord`."""
    # TODO first try to determine if a SkyCoord is embedded in the table

    frame = SkyCoord.guess_from_table(data)

    # TODO more robust method of determining epoch
    # like allowing a kwarg to say where it is, or specify it.
    if "obstime" in data.dtype.fields:
        frame.obstime = Time(data["obstime"])
    elif "epoch" in data.dtype.fields:
        frame.obstime = Time(data["epoch"])
    elif "ref_epoch" in data.dtype.fields:
        frame.obstime = Time(data["ref_epoch"])

    elif "obstime" in data.meta:
        frame.obstime = Time(data.meta["obstime"])
    elif "epoch" in data.meta:
        frame.obstime = Time(data.meta["epoch"])
    elif "ref_epoch" in data.meta:
        frame.obstime = Time(data.meta["ref_epoch"])

    return frame

t = Table([[1, 2, 3]*u.deg, [4, 5, 6]*u.deg], names=["ra", "dec"])
t.meta["obstime"] = Time.now()

dg.get_transform(Table, SkyCoord)(t)

dg.get_transform(Table, Table)(t)
[14]:
<SkyCoord (ICRS): (ra, dec) in deg
    [(1., 4.), (2., 5.), (3., 6.)]>
[14]:
Table length=3
ra      dec
deg     deg
float64 float64
1.0     4.0
2.0     5.0
3.0     6.0
[16]:
del dg
dg = datagraph.TransformGraph()
@dg.register(datagraph.DataTransform, str, None)
def str_to_None(data):
    return None

print(dg.get_transform(str, None)("test this"))

dg._graph
None
[16]:
defaultdict(dict,
            {None: {str: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11f72c450>,
              list: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dee6f90>,
              tuple: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dee6910>,
              dict: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dee91d0>},
             tuple: {str: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dee9050>,
              list: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dfb5310>,
              dict: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dfb5410>},
             list: {str: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dfb5510>,
              tuple: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dfb5610>,
              dict: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dfb5710>},
             str: {tuple: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dfb5810>,
              list: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dfb5910>,
              dict: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dfb5a10>},
             astropy.coordinates.sky_coordinate.SkyCoord: {astropy.table.table.Table: <starkman_thesis.utils.data.datagraph.DataTransform at 0x120b2a290>,
              astropy.coordinates.baseframe.BaseCoordinateFrame: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11dee6dd0>,
              astropy.coordinates.builtin_frames.icrs.ICRS: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11f716750>},
             astropy.coordinates.builtin_frames.icrs.ICRS: {astropy.coordinates.builtin_frames.galactic.Galactic: <starkman_thesis.utils.data.datagraph.DataTransform at 0x11f716610>},
             astropy.table.table.Table: {}})
[18]:
dg._graph[None][str]("into the void")

END