Hi all,
I have dataframes as follow:
Row │ year sector roa
│ Int64 String Float64
─────┼─────────────────────────────────────────
1 │ 2017 Construction 1.15
2 │ 2017 Services 1.78
3 │ 2017 Agriculture 1.82
4 │ 2018 Construction 1.05
5 │ 2018 Services 1.56
6 │ 2018 Agriculture 1.55
7 │ 2019 Construction 1.32
8 │ 2019 Services 1.53
9 │ 2019 Agriculture 1.38
10 │ 2020 Construction 1.11
11 │ 2020 Services 1.93
12 │ 2020 Agriculture 1.21
13 │ 2021 Construction 0.03
14 │ 2021 Services 0.06
15 │ 2021 Agriculture 0.36
I would like to calculate the rate of change starting from 2017 to 2021 for each sector. I appreciate any suggestions. Thanks all!
You can use the package ShiftedArrays
.
using DataFrames
import ShiftedArrays: lag
# mock data
using Random; Random.seed!(1234)
dff = DataFrame(year = repeat(2017:2021, inner=5), sector = repeat(["A", "B", "C", "D", "E"],5), roa = rand(25))
# computation
sort!(dff, [:sector, :year]) # sort data by the group and year
transform!(groupby(dff,[:sector]), :roa => lag => :prev_roa) # create lag value
dff.rate_change = (dff.roa ./ dff.prev_roa .- 1 ) * 100 # create rate_change
with output
julia> dff
25×5 DataFrame
Row │ year sector roa prev_roa rate_change
│ Int64 String Float64 Float64? Float64?
─────┼────────────────────────────────────────────────────────
1 │ 2017 A 0.579862 missing missing
2 │ 2018 A 0.639562 0.579862 10.2955
3 │ 2019 A 0.566704 0.639562 -11.3918
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮
23 │ 2019 E 0.806704 0.696041 15.899
24 │ 2020 E 0.939548 0.806704 16.4674
25 │ 2021 E 0.131026 0.939548 -86.0544
19 rows omitted
2 Likes