Managing Pandas’ deprecation of the Series first() and last() methods.

Have you stumbled across this warning in your code after updating Pandas: “FutureWarning: last is deprecated and will be removed in a future version. Please create a mask and filter using `.loc` instead”? In this post, we’ll explore how that method works and how to replace it.

I’ve always loved the use-case driven nature of methods and functions in the Pandas library. Pandas is such a workhorse in scientific computing in Python, particularly when it comes to time series and other calendar-labeled data. So it was with a touch of frustration and puzzlement that I discovered that the last method had been deprecated, and that its removal from Pandas’ Series and DataFrame types is planned. In the Data Analysis with Pandas course, we used to have an in-class exercise where we recommended getting the last 4 weeks’ data using something like this:

In [1]: import numpy as np
   ...: import pandas as pd
   ...: rng = np.random.default_rng(42)
   ...: measurements = pd.Series(
   ...:     data=np.cumsum(rng.choice([-1, 1], size=350)),
   ...:     index=pd.date_range(
   ...:         start="01/01/2025",
   ...:         freq="D",
   ...:         periods=350,
   ...:     ),
   ...: )
In [2]: measurements.last('1W')
<ipython-input-5-ec16e51fe7ce>:1: FutureWarning: last is deprecated and will be removed in a future version. Please create a mask and filter using `.loc` instead
  measurements.last('1W')
Out[2]:
2025-12-15   -7
2025-12-16   -8
Freq: D, dtype: int64

This has the really useful behavior of selecting data based on where it falls in a calendar period. The command above returns the two elements of our Series that fall in the last calendar week, which begins (in ISO format) on Monday, Dec 15.

The deprecation warning recommends creating a mask and filtering with .loc instead. Because last is a useful feature, I wanted to take a closer look to see if I could understand what’s going on and what the best way to replace it would be.

Poking into the code a bit, we can see that the .last method is a convenience function that uses pd.tseries.frequencies.to_offset to turn '1W', technically a designation of period, into an offset. That offset is subtracted from the last element of the DatetimeIndex, yielding the starting point for a slice on the index. From the definition of last:

    ...
    offset = to_offset(offset)

    start_date = self.index[-1] - offset
    start = self.index.searchsorted(start_date, side="right")
    return self.iloc[start:]

Note that side='right' in searchsorted finds the position of the first index entry greater than start_date. We could wrap all of this into an equivalent statement that yields no FutureWarning:

In [3]: start = measurements.index[-1] - pd.tseries.frequencies.to_offset('1W')
In [4]: measurements.loc[measurements.index > start]
Out[4]:
2025-12-15   -7
2025-12-16   -8
Freq: D, dtype: int64
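
The same logic can be bundled into a small reusable function. The sketch below is a minimal stand-in, not Pandas’ own code: last_by_offset is a hypothetical name, and it assumes a sorted, non-empty DatetimeIndex. To stay self-contained it uses a simple counter in place of the random walk.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset


def last_by_offset(series, offset):
    """Return the trailing rows of `series` selected by `offset`.

    Hypothetical helper mirroring the logic quoted above; assumes a sorted,
    non-empty DatetimeIndex.
    """
    start_date = series.index[-1] - to_offset(offset)
    # side="right" gives the position of the first label strictly after start_date
    start = series.index.searchsorted(start_date, side="right")
    return series.iloc[start:]


# Same calendar as the example above, with a counter instead of a random walk
measurements = pd.Series(
    data=range(350),
    index=pd.date_range(start="2025-01-01", freq="D", periods=350),
)

tail = last_by_offset(measurements, "1W")
# The mask-based form should select the same rows
mask_tail = measurements.loc[
    measurements.index > measurements.index[-1] - to_offset("1W")
]
print(tail.index.tolist())
```

Positional slicing with iloc and the boolean mask are interchangeable here only because the index is sorted; on an unsorted index the two could disagree.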

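There’s a better option, though, which is to use pd.DateOffset. It’s a top-level import, and it makes the start of the week explicit, which the bare '1W' alias (anchored to Sunday) does not. Pandas follows Python’s weekday numbering, so Monday is day 0. Here the offset steps back one week and then moves forward to the next Monday, so start is Monday, Dec 15 itself and the mask uses >= to keep that day:

In [5]: start = measurements.index[-1] - pd.DateOffset(weeks=1, weekday=0)
In [6]: measurements.loc[measurements.index >= start]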
Out[6]:
2025-12-15   -7
2025-12-16   -8
Freq: D, dtype: int64
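
When switching between offset types, it can help to print the boundary each one produces before filtering on it. This is a minimal sketch, assuming the same last timestamp as above (Tuesday, Dec 16, 2025); the variable names are illustrative.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

last_day = pd.Timestamp("2025-12-16")  # a Tuesday, as in the example above

# "1W" is anchored to Sunday, so subtracting it rolls back to the previous Sunday
sunday_start = last_day - to_offset("1W")

# weeks=1 steps back seven days; weekday=0 then moves forward to the next Monday
monday_start = last_day - pd.DateOffset(weeks=1, weekday=0)

print(sunday_start.day_name(), sunday_start.date())
print(monday_start.day_name(), monday_start.date())
```

A strict > against the Sunday boundary and a >= against the Monday boundary select the same rows, so it is worth checking which side of the boundary the comparison should include.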

Slicing also works, even if the start point doesn’t coincide with a location in the index. Mixed offset specifications are possible, too:

In [7]: measurements.loc[measurements.index[-1] - pd.DateOffset(days=1, hours=12):]
Out[7]:
2025-12-15   -7
2025-12-16   -8
Freq: D, dtype: int64

The strength of pd.DateOffset is that it is calendar aware, so you can specify the day of the month, for example:

In [8]: measurements.loc[measurements.index[-1] - pd.DateOffset(day=13):]
Out[8]:
2025-12-13   -7
2025-12-14   -6
2025-12-15   -7
2025-12-16   -8
Freq: D, dtype: int64
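
One thing to watch for: singular keywords such as day replace a field of the timestamp, while plural keywords such as days shift it. The sketch below contrasts the two using illustrative dates, and also shows that day=13 selects nothing when the series ends before the 13th of the month.

```python
import pandas as pd

end = pd.Timestamp("2025-12-16")

# Singular `day` sets the day of the month to 13
replaced = end - pd.DateOffset(day=13)
# Plural `days` moves back 13 days
shifted = end - pd.DateOffset(days=13)
print(replaced.date(), shifted.date())

# If the series ends before the 13th, the boundary lies after the last label
short = pd.Series(
    data=range(10),
    index=pd.date_range(start="2025-12-01", freq="D", periods=10),
)
boundary = short.index[-1] - pd.DateOffset(day=13)
print(boundary.date(), len(short.loc[boundary:]))
```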

There’s also the non-calendar-aware pd.Timedelta, which you can use to count back a set time period without taking day-of-week or day-of-month into account. Note: as with all Pandas label-based slicing, it is endpoint inclusive, so 1 week yields 8 days’ measurements:

In [9]: measurements.loc[measurements.index[-1] - pd.Timedelta(weeks=1):]
Out[9]:
2025-12-09   -9
2025-12-10   -8
2025-12-11   -7
2025-12-12   -8
2025-12-13   -7
2025-12-14   -6
2025-12-15   -7
2025-12-16   -8
Freq: D, dtype: int64
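
If exactly seven days are wanted, a strict comparison drops the inclusive endpoint, and the same idea applied from the front of the index can stand in for first(). This is a minimal sketch, assuming the daily index from above with a counter as placeholder data:

```python
import pandas as pd

measurements = pd.Series(
    data=range(350),
    index=pd.date_range(start="2025-01-01", freq="D", periods=350),
)
week = pd.Timedelta(weeks=1)

# Label-based slice: both endpoints are included
inclusive = measurements.loc[measurements.index[-1] - week:]

# Strict mask: the boundary day itself is excluded
last_week = measurements.loc[measurements.index > measurements.index[-1] - week]

# Counterpart from the start of the series
first_week = measurements.loc[measurements.index < measurements.index[0] + week]

print(len(inclusive), len(last_week), len(first_week))
```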

You may have noticed I prefer slicing notation, whereas the deprecation message suggests using a mask array. There’s a performance advantage to using slicing, and the notation is more compact than the mask array, though less so than the last() method. In IPython or Jupyter, we can use %timeit to quantify the difference:

In [10]: %timeit measurements.loc[measurements.index[-1] - pd.DateOffset(day=13):]
45.7 μs ± 2.36 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [11]: %timeit measurements.last('4D')
56.3 μs ± 14.9 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [12]: %timeit measurements.loc[measurements.index >= measurements.index[-1] - pd.DateOffset(day=13)]
89.2 μs ± 6.31 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
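
Outside IPython, the standard-library timeit module gives a comparable measurement. This is a sketch only: timings depend on the machine, the Pandas version, and the length of the series, so no numbers are shown. Unlike the %timeit runs above, it computes the start label once, outside the timed functions.

```python
import timeit

import pandas as pd

measurements = pd.Series(
    data=range(350),
    index=pd.date_range(start="2025-01-01", freq="D", periods=350),
)
start = measurements.index[-1] - pd.DateOffset(day=13)


def by_slice():
    # Label-based slice from the start label to the end of the series
    return measurements.loc[start:]


def by_mask():
    # Boolean mask built by comparing every index entry with the start label
    return measurements.loc[measurements.index >= start]


# Both approaches should select the same rows before their speed is compared
assert by_slice().equals(by_mask())

for func in (by_slice, by_mask):
    seconds = timeit.timeit(func, number=1000)
    print(f"{func.__name__}: {seconds / 1000 * 1e6:.1f} µs per call")
```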

After I spent some time with git blame and the Pandas-dev source code repository, the reasons for the deprecation of the first and last methods made sense:

  • there is unexpected behavior when passing certain kinds of offsets
  • they don’t behave analogously to SeriesGroupBy.first and SeriesGroupBy.last
  • they don’t respect time zones properly

Hopefully this has been a useful exploration of pd.Series.last (and .first), their deprecation, and how to replace them in your code with the more-explicit and better-defined masks and slices. Happy Coding!
