Creating variables

Variables are created and at the same time assigned to the model using the function

model.add_variables

where model is a linopy.Model instance. In the following we show how this function works and what the resulting variables look like. So, let’s create a model and go through it!

[1]:
from linopy import Model
import numpy as np
import pandas as pd
import xarray as xr
m = Model()

First of all, it is crucial to know that the return value of the .add_variables function is a linopy.Variable, which is essentially an xarray.DataArray with some additional features. That means it can have an arbitrary number of labeled dimensions. For each coordinate, exactly one representative scalar variable is defined.

The first three arguments of the .add_variables function are 1. lower, denoting the lower bound of the variables (default -inf), 2. upper, denoting the upper bound (default +inf), and 3. coords (default None). These arguments determine the shape of the added variable array.

Generally, the function is strongly aligned to the initialization of an xarray.DataArray, meaning lower and upper can be

  • scalar values (int/float)

  • numpy ndarrays

  • pandas Series

  • pandas DataFrames

  • xarray DataArrays

Note that scalars, numpy objects and pandas objects do not have or do not require dimension names. Thus, xarray names the dimensions automatically unless coords is passed explicitly. As we show later, it is very important to take care of the dimension names.
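To see the automatic naming in isolation, the following sketch passes the same kinds of objects directly to xarray.DataArray, without going through linopy. It assumes a reasonably recent xarray; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd
import xarray as xr

# A scalar has no dimensions at all.
from_scalar = xr.DataArray(5)

# numpy arrays and unnamed pandas objects get generic names dim_0, dim_1, ...
from_numpy = xr.DataArray(np.array([1, 2]))
from_frame = xr.DataArray(pd.DataFrame([[1, 2, 3], [4, 5, 6]]))

print(from_scalar.dims)  # ()
print(from_numpy.dims)   # ('dim_0',)
print(from_frame.dims)   # ('dim_0', 'dim_1')
```

The generic names carry no meaning, which is why the sections below name the dimensions explicitly.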

Using scalar values

If we keep the defaults, -inf for lower and +inf for upper, the call returns

[2]:
m.add_variables()
[2]:
<linopy.Variable 'var0' ()>
array(0)
Attributes:
    binary:   False

which is a variable without any coordinates and with just one scalar variable with label 0. You can pass any scalar to the lower and upper bounds, e.g.

[3]:
m.add_variables(lower=9, upper=15)
[3]:
<linopy.Variable 'var1' ()>
array(1)
Attributes:
    binary:   False

If coords is given, these will be ignored.

Using numpy arrays

If lower and upper are numpy arrays, linopy requires the coords argument not to be None; otherwise an error is raised. Thus, it is helpful to define the coordinates in advance and pass them to the function.

[4]:
coords = (pd.RangeIndex(2, name='a'),)  # one-element tuple of indexes
lower = np.array([1, 2])
m.add_variables(lower=lower, coords=coords)
[4]:
<linopy.Variable 'var2' (a: 2)>
array([2, 3])
Coordinates:
  * a        (a) int64 0 1
Attributes:
    binary:   False

Note three things:

  1. coords is a tuple of indexes, as expected by xarray.DataArray.

  2. The shape of lower is aligned with coords.

  3. A name was set in the index creation. This is helpful because it makes explicit which dimension the variable is defined on. Otherwise xarray would insert generic dimension names, which can lead to unexpected broadcasting later.
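The three points can be checked on a plain xarray.DataArray built from the same inputs. This is a minimal sketch that uses xarray only and makes no claim about how linopy handles coords internally.

```python
import numpy as np
import pandas as pd
import xarray as xr

coords = (pd.RangeIndex(2, name="a"),)  # one index per dimension
data = xr.DataArray(np.array([1, 2]), coords=coords)

print(data.dims)                # ('a',)  taken from the index name
print(data.shape)               # (2,)    matches the length of the index
print(list(data.indexes["a"]))  # [0, 1]
```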

Let’s repeat the add_variables call without a dimension name on the index:

[5]:
coords = (pd.RangeIndex(2),)  # same index, but without a name
m.add_variables(lower=lower, coords=coords)
[5]:
<linopy.Variable 'var3' (dim_0: 2)>
array([4, 5])
Coordinates:
  * dim_0    (dim_0) int64 0 1
Attributes:
    binary:   False

The dimension is now called dim_0. Any new variable that is added without dimension names will also try to use that dimension name. This is not recommended, as it can bloat the data structure of the model.
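One way a shared generic name can inflate data is through alignment. The sketch below shows the xarray mechanism only; whether and how linopy aligns variables internally is an assumption not covered here.

```python
import pandas as pd
import xarray as xr

short = xr.DataArray(pd.Series([1.0, 2.0]))      # dims: ('dim_0',), labels 0..1
long = xr.DataArray(pd.Series([1.0, 2.0, 3.0]))  # dims: ('dim_0',), labels 0..2

# An outer join puts both arrays on the union of the dim_0 labels.
short_aligned, long_aligned = xr.align(short, long, join="outer")

print(short_aligned.sizes["dim_0"])       # 3
print(int(short_aligned.isnull().sum()))  # 1 (the padded entry)
```

The two arrays were unrelated, yet the shorter one now carries a padded entry simply because both ended up on dim_0.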

Hint: If you want to make sure that you are not mixing up dimensions, create the model with the flag force_dim_names=True, i.e.

[6]:
other = Model(force_dim_names=True)
try:
    other.add_variables(lower=lower, coords=coords)
except ValueError as e:
    print("This raised an error:", e)
This raised an error: Added data contains non-customized dimension names. This is not allowed when setting `force_dim_names` to True.
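A comparable check can be applied to your own input data before it reaches the model. The helper below, has_generic_dim_names, is hypothetical and is not part of linopy; it only looks for the default names that xarray assigns.

```python
import re

import numpy as np
import xarray as xr


def has_generic_dim_names(array: xr.DataArray) -> bool:
    """Return True if any dimension has an xarray default name (dim_0, dim_1, ...)."""
    return any(re.fullmatch(r"dim_\d+", str(dim)) for dim in array.dims)


named = xr.DataArray(np.array([1, 2]), dims="a")
unnamed = xr.DataArray(np.array([1, 2]))

print(has_generic_dim_names(named))    # False
print(has_generic_dim_names(unnamed))  # True
```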

Using pandas objects

Pandas objects always have indexes but do not require dimension names. It is again helpful to ensure that the variables have explicit dimension names when passing lower and upper without coords. This can be done either by passing the dims argument to the .add_variables function, i.e.

[7]:
lower = pd.Series([1,1])
upper = pd.Series([10, 12])
m.add_variables(lower, upper, dims='my-dim')
[7]:
<linopy.Variable 'var4' (my-dim: 2)>
array([6, 7])
Coordinates:
  * my-dim   (my-dim) int64 0 1
Attributes:
    binary:   False

or naming the indexes and columns of the pandas objects directly, e.g.

[8]:
lower = pd.Series([1,1]).rename_axis('my-dim')
upper = pd.Series([10, 12]).rename_axis('my-dim')
m.add_variables(lower, upper)
[8]:
<linopy.Variable 'var5' (my-dim: 2)>
array([8, 9])
Coordinates:
  * my-dim   (my-dim) int64 0 1
Attributes:
    binary:   False
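Both routes lead to the same dimension name. The sketch below checks this on plain xarray.DataArray objects, assuming that dims and a named pandas axis are handled the way the xarray constructor handles them.

```python
import pandas as pd
import xarray as xr

series = pd.Series([1, 1])

via_dims = xr.DataArray(series, dims="my-dim")              # name given as argument
via_axis_name = xr.DataArray(series.rename_axis("my-dim"))  # name stored on the index

print(via_dims.dims)       # ('my-dim',)
print(via_axis_name.dims)  # ('my-dim',)
```

Naming the axis on the pandas object has the advantage that the name travels with the data.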

Note: If lower and upper do not have the same dimension names, the arrays are broadcast, meaning the variable spans all of the dimensions:

[9]:
lower = pd.Series([1,1]).rename_axis('my-dim')
upper = pd.Series([10, 12]).rename_axis('my-other-dim')
m.add_variables(lower, upper)
[9]:
<linopy.Variable 'var6' (my-dim: 2, my-other-dim: 2)>
array([[10, 11],
       [12, 13]])
Coordinates:
  * my-dim        (my-dim) int64 0 1
  * my-other-dim  (my-other-dim) int64 0 1
Attributes:
    binary:   False

Now instead of 2 variables, 4 variables were defined.
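The count of four follows directly from xarray broadcasting, which can be reproduced without a model. This is a sketch of the mechanism using xr.broadcast, not of linopy's internal code.

```python
import pandas as pd
import xarray as xr

lower = xr.DataArray(pd.Series([1, 1]).rename_axis("my-dim"))
upper = xr.DataArray(pd.Series([10, 12]).rename_axis("my-other-dim"))

# Each array is expanded along the dimension it was missing.
lower_b, upper_b = xr.broadcast(lower, upper)

print(lower_b.dims)  # ('my-dim', 'my-other-dim')
print(lower_b.size)  # 4
```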

Similar behavior applies when passing a DataFrame and a Series without dimension names. The index axis is the first axis of both objects, so these are expected to be the same (note that the pandas convention, in contrast, is to align and broadcast Series along the column dimension of DataFrames):

[10]:
lower = pd.DataFrame([[1,1, 2], [1,2,2]])
upper = pd.Series([10, 12])
m.add_variables(lower, upper)
[10]:
<linopy.Variable 'var7' (dim_0: 2, dim_1: 3)>
array([[14, 15, 16],
       [17, 18, 19]])
Coordinates:
  * dim_0    (dim_0) int64 0 1
  * dim_1    (dim_1) int64 0 1 2
Attributes:
    binary:   False
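The difference between the two conventions is easy to miss, so here is a small side-by-side sketch with the same data. It uses plain pandas and xarray arithmetic; the variable names are illustrative.

```python
import pandas as pd
import xarray as xr

frame = pd.DataFrame([[1, 1, 2], [1, 2, 2]])
series = pd.Series([10, 12])

# pandas: the Series index is matched against the DataFrame *columns*.
pandas_sum = frame + series
print(pandas_sum.shape)                  # (2, 3)
print(bool(pandas_sum[2].isna().all()))  # True: the Series has no label 2

# xarray: both objects share 'dim_0', so the Series runs along the rows.
xarray_sum = xr.DataArray(frame) + xr.DataArray(series)
print(xarray_sum.dims)             # ('dim_0', 'dim_1')
print(xarray_sum.values.tolist())  # [[11, 11, 12], [13, 14, 14]]
```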

Again, one is always safer when explicitly naming the dimensions:

[11]:
lower = lower.rename_axis(index='my-dim', columns='my-other-dim')
upper = upper.rename_axis('my-dim')
m.add_variables(lower, upper)
[11]:
<linopy.Variable 'var8' (my-dim: 2, my-other-dim: 3)>
array([[20, 21, 22],
       [23, 24, 25]])
Coordinates:
  * my-other-dim  (my-other-dim) int64 0 1 2
  * my-dim        (my-dim) int64 0 1
Attributes:
    binary:   False

The coords and dims arguments are applied to lower and upper individually. Hence, when mixing arrays of different shapes, setting coords or dims will raise an error:

[12]:
coords = pd.Index([1,2]), pd.Index([3,4,5])
try:
    m.add_variables(lower, upper, coords=coords)
except ValueError as e:
    print("This raises an error:", e)
This raises an error: coords is not dict-like, but it has 2 items, which does not match the 1 dimensions of the data
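The message originates from the xarray.DataArray constructor, so the failure can be reproduced without linopy. The sketch below applies the same two-index coords to a two-dimensional and to a one-dimensional array.

```python
import numpy as np
import pandas as pd
import xarray as xr

coords = (pd.Index([1, 2]), pd.Index([3, 4, 5]))  # two indexes, two dimensions

two_dim = xr.DataArray(np.ones((2, 3)), coords=coords)  # shapes agree
print(two_dim.shape)  # (2, 3)

try:
    xr.DataArray(np.array([10, 12]), coords=coords)  # one-dimensional data
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```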

Using xarray DataArrays

This is the most straightforward and recommended method to create variables, as DataArrays have a well-defined set of dimension names.

[13]:
lower = xr.DataArray([1,2,3], coords=(pd.RangeIndex(3),), dims='my-dim')
m.add_variables(lower)
[13]:
<linopy.Variable 'var10' (my-dim: 3)>
array([26, 27, 28])
Coordinates:
  * my-dim   (my-dim) int64 0 1 2
Attributes:
    binary:   False

Again, you can arbitrarily broadcast dimensions when passing DataArrays with different sets of dimensions. Note, however, that linopy expects non-empty coordinates in order to keep the model structure clean.