Functional Programming in C# Part 2

2.0 My notes will cover these points:

- What makes a function pure or impure
- Why purity matters in concurrent scenarios
- How purity relates to testability
- Reducing the impure footprint of your code

Functional programmers are suckers for pure functions: functions with no side effects.

2.1 What is function purity?

Mathematical functions are completely abstract entities. Although some programming functions are close representations of mathematical functions, in practice we often want a function to do something, that is, to have a side effect. Mathematical functions do nothing of the sort; they only return a value. They exist in a vacuum, so their results are determined strictly by their arguments.

Pure functions closely resemble mathematical functions: they do nothing other than compute an output value based on their input values.

2.1.1 This table contrasts pure vs impure functions.

[image.png: table contrasting pure and impure functions]

From the above, what counts as a side effect?

A function is said to have side effects if it does any of the following:

- Mutates global state. "Global" here means any state that's visible outside of the function's scope. For example, a private instance field is considered global because it's visible from all methods within the class.
- Mutates the input arguments.
- Throws exceptions.
- Performs any I/O operation. This includes any interaction between the program and the external world, such as interacting with any process outside the application's boundary.

In summary, pure functions have no side effects, and their output is solely determined by their inputs.

The deterministic nature of pure functions (that is, the fact that they always return the same output for the same input) has some interesting consequences.

Pure functions are easy to test and to reason about. The fact that outputs depend only on inputs also means that the order of evaluation isn't important. As a result, the parts of your program that consist entirely of pure functions can be optimized in a number of ways:

- Parallelization: different threads carry out tasks in parallel.
- Lazy evaluation: only evaluate values as needed.
- Memoization: cache the result of a function so it's only computed once.

Using these techniques with impure functions can lead to rather nasty bugs. For this reason, FP advocates that pure functions should be preferred whenever possible.
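To make the memoization point concrete, here is a minimal sketch. The names Memo and Memoize are my own (hypothetical, not from the book), and the cache is not thread-safe. Caching is only safe here because a pure function always returns the same output for the same input.

```csharp
using System;
using System.Collections.Generic;

public static class Memo
{
    // Wraps f so that each distinct input is computed at most once.
    // Sketch only: the dictionary is not safe for concurrent use.
    public static Func<T, R> Memoize<T, R>(Func<T, R> f) where T : notnull
    {
        var cache = new Dictionary<T, R>();
        return x =>
        {
            if (!cache.TryGetValue(x, out var result))
            {
                result = f(x);      // first call for this input: compute
                cache[x] = result;  // and remember the result
            }
            return result;          // later calls: served from the cache
        };
    }
}
```

If the wrapped function were impure (for example, one that reads the system clock), the cached value would go stale, which is the kind of nasty bug mentioned above.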

2.1.2 Strategies for managing side effects

The strategies for managing side effects depend on the type of side effects.

Isolate IO effects

Functions that perform IO can never be pure, for the following reasons:

- A function that takes a URL and returns the resource at that URL will yield a different result any time the remote resource changes, or it may throw an error if the connection is unavailable.
- A function that takes a file path and contents to be written to a file may throw an error if the directory does not exist, or if the process hosting the program lacks write permissions.
- A function that returns the current time from the system clock will return a different result at any instant.

So any dependency on the external world gets in the way of function purity, because the state of the world affects the function's return value.

We can, however, isolate the pure computational parts of a program from the IO. In this way, we minimize the footprint of IO and reap the benefits of purity for the pure parts of the program.

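// Assume: using static System.Console; and that the statements are wrapped in a Main method.
// Before: IO is mixed with logic that could be extracted into a pure function.
WriteLine("Enter your last name:");
var lastName = ReadLine();
WriteLine($"Hello {lastName}");

// After: the logic is separated from the IO. Greeting is pure; only its caller performs IO.
static string Greeting(string lastName) => $"Hello {lastName}";
// The last statement above then becomes: WriteLine(Greeting(lastName));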


An impure function can call the pure function that performs the translation, but the pure function that performs the translation can't call any function that performs IO.

Avoid mutating arguments

Another kind of side effect is the mutation of function arguments, which is a bad idea in any programming paradigm.

The code below mutates one of its arguments, so it violates this principle:

/* Called when the quantity of items in an order is modified; it recomputes the total value of the order.
   Side effect: adds to the linesToDelete list the order lines whose quantity has been changed to zero. */
decimal RecomputeTotal(Order order, List<OrderLine> linesToDelete)
{
   var result = 0m;
   foreach (var line in order.OrderLines)
      if (line.Quantity == 0) linesToDelete.Add(line);
      else result += line.Product.Price * line.Quantity;
   return result;
}

The behavior of the method is now tightly coupled with that of the caller: the caller relies on the method to perform its side effect, and the callee relies on the caller to initialize the list.

We can avoid this side effect by returning all computed information to the caller instead.

(decimal, IEnumerable<OrderLine>) RecomputeTotal(Order order)
  => (order.OrderLines.Sum(l => l.Product.Price * l.Quantity)
    , order.OrderLines.Where(l => l.Quantity == 0));

Applying this principle, functions never mutate their input arguments. It would be ideal to enforce this by always using immutable objects: objects that, once created, can't be changed.
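One way to get such immutable objects is a C# 9 record. The sketch below is my own illustration, not the book's code: OrderLine here is a simplified, hypothetical version of the type used above (it carries Price directly instead of a Product object), and WithQuantity is a made-up helper.

```csharp
using System;

// Hypothetical, simplified order line. Positional record properties are
// init-only, so they can't be reassigned after construction.
public record OrderLine(string Product, decimal Price, int Quantity);

public static class OrderLineExt
{
    // Returns a modified copy; the argument itself is left untouched.
    public static OrderLine WithQuantity(this OrderLine line, int quantity)
        => line with { Quantity = quantity };
}
```

A caller that wants a changed order line has to use the returned copy, so a function receiving an OrderLine can't alter it behind the caller's back.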

2.2. Purity and concurrency

Assume that we want to format a list of strings as a numbered list: the casing should be standardized, and each item should be preceded by a counter.

    public static class StringExt
    {
        // Pure: upper-cases the first character and lower-cases the rest.
        public static string ToSentenceCase(this string s) =>
            s == string.Empty ? string.Empty : char.ToUpperInvariant(s[0]) + s.ToLower()[1..]; // = s.ToLower().Substring(1)
    }

    public class ListFormatter
    {
        int counter;

        // Impure: mutates the counter field.
        string PrependCounter(string s) => $"{++counter}. {s}";

        // Returns List<string> so that callers can use List.ForEach and indexing.
        public List<string> Format(IEnumerable<string> list) =>
            list.Select(StringExt.ToSentenceCase).Select(PrependCounter).ToList();
    }

Notes on purity:

- ToSentenceCase is pure, as its output is strictly determined by the input. Because its computation depends only on the input parameter, it can be made static without any problems.
- PrependCounter increments the counter, so it's impure.
- Format applies both functions to the items in the list with Select, irrespective of purity.

var shoppingList = new List<string> { "coffee beans", "BANANAS", "Dates" };

new ListFormatter()
   .Format(shoppingList)
   .ForEach(WriteLine);

If the list we're formatting were big enough, would it make sense to perform the string manipulation in parallel? We tackle this question next.

2.2.1 Pure functions parallelize well

Pure functions parallelize well and are generally immune to the issues that make concurrency difficult.

list.Select(ToSentenceCase).ToList()
list.AsParallel().Select(ToSentenceCase).ToList()

The first expression uses the Select method defined on Enumerable to apply the pure function ToSentenceCase to each element in the list.

The second expression is very similar, but it uses methods provided by Parallel LINQ (PLINQ). AsParallel turns the list into a ParallelQuery; as a result, Select resolves to the implementation defined on ParallelEnumerable. The list will be split into chunks, and several threads will be fired off to process the chunks. ToList harvests the results into a list.
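The two expressions can be put side by side in a small sketch. ParallelDemo and its method names are hypothetical, and ToSentenceCase is repeated so the block stands alone. I added AsOrdered as an assumption on my part: PLINQ does not promise to keep the source order by default, and a numbered list needs it.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ParallelDemo
{
    // Pure: the result depends only on the argument.
    public static string ToSentenceCase(string s) =>
        s == string.Empty ? string.Empty : char.ToUpperInvariant(s[0]) + s.ToLower().Substring(1);

    // Sequential: Select resolves to Enumerable.Select.
    public static List<string> FormatSequential(List<string> list) =>
        list.Select(ToSentenceCase).ToList();

    // Parallel: Select resolves to ParallelEnumerable.Select.
    // AsOrdered asks PLINQ to preserve the source order in the results.
    public static List<string> FormatParallel(List<string> list) =>
        list.AsParallel().AsOrdered().Select(ToSentenceCase).ToList();
}
```

Because ToSentenceCase touches no shared state, both methods should produce the same list no matter how the work is split across threads.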

2.2.2 Parallelizing impure functions

Now let's apply the same parallelization to the impure PrependCounter function:

list.Select(PrependCounter).ToList()
list.AsParallel().Select(PrependCounter).ToList()

Because PrependCounter increments the counter variable, the parallel version will have multiple threads reading and updating the counter. Because there's no locking in place, we'll lose some of the updates and end up with an incorrect result. We can fix this by using the Interlocked class when incrementing the counter. I use xUnit for testing:

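        // Sample failure message, apparently captured from a run with a longer list than the one below:
        // Expected string length 20 but was 19. Strings differ at index 0.
        // Expected: "1000000. Item1000000"
        // But was:  "85174. Item1000000"
        // -----------^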
        [Fact]
        public void ItWorksOnAVeryLongList2()
        {
            var size = 100000;
            var input = Enumerable.Range(1, size).Select(i => $"item {i}").ToList();
            var output = new ListFormatter_ParNaive().Format(input);
            Assert.Equal("100000. Item 100000", output[size - 1]);
        }
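The Interlocked fix mentioned above can be sketched in isolation. This is not the book's ListFormatter; CounterDemo and its methods are hypothetical names, and the loop body only counts, so that the lost updates are easy to see.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class CounterDemo
{
    // Unsynchronized ++ is a read-modify-write, so concurrent updates can be lost.
    public static int CountNaive(int n)
    {
        var counter = 0;
        Parallel.For(0, n, _ => { counter++; });
        return counter; // may be less than n
    }

    // Interlocked.Increment performs the read-modify-write atomically.
    public static int CountInterlocked(int n)
    {
        var counter = 0;
        Parallel.For(0, n, _ => { Interlocked.Increment(ref counter); });
        return counter; // every increment is counted
    }
}
```

Atomic increments stop updates from being lost, but the numbers are still handed out in whatever order the threads happen to run, so this alone doesn't guarantee that item N gets counter N.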

2.2.3 Avoid state mutation

One way to avoid the pitfalls of concurrent updates is to remove the problem at the source: don't use shared state to begin with. One solution is Zip; this operation of pairing two parallel lists is a common one in FP.

Enumerable.Zip(
   new[] {1, 2, 3},
   new[] {"ichi", "ni", "san"},
   (number, name) => $"In Japanese, {number} is: {name}")

Format can be refactored to use Zip:

   // Assumes: using static System.Linq.Enumerable; (for Range and Zip)
   public static List<string> Format(List<string> list)
   {
      var left = list.Select(StringExt.ToSentenceCase);
      var right = Range(1, list.Count);
      var zipped = Zip(left, right, (s, i) => $"{i}. {s}");
      return zipped.ToList();
   }

The same, using Zip as an extension method:

public static List<string> Format(List<string> list)
   => list
      .Select(StringExt.ToSentenceCase)
      .Zip(Range(1, list.Count), (s, i) => $"{i}. {s}")
      .ToList();
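With the shared counter gone, the whole pipeline can run in parallel. Here is a minimal sketch, assuming PLINQ's ParallelEnumerable.Range and Zip; the class name ListFormatterParZip is hypothetical, and the AsOrdered calls are my own precaution so that items and numbers are paired in source order.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListFormatterParZip
{
    // Pure helper, repeated here so the sketch is self-contained.
    static string ToSentenceCase(string s) =>
        s == string.Empty ? string.Empty : char.ToUpperInvariant(s[0]) + s.ToLower().Substring(1);

    // No mutable state: the numbers come from a range, not from a counter field.
    public static List<string> Format(List<string> list) =>
        list.AsParallel().AsOrdered()
            .Select(ToSentenceCase)
            .Zip(ParallelEnumerable.Range(1, list.Count).AsOrdered(),
                 (s, i) => $"{i}. {s}")
            .ToList();
}
```

Since nothing is mutated, there are no updates to lose, and no locking or Interlocked calls are needed.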

2.3 Purity and testability

Consider an online banking application that allows users to make transfers through the web or a mobile app. Imagine that a customer makes a request to transfer money. Before booking the transfer, the server will have to validate this request.

[Figure: Business scenario: validating a transfer request]

Let’s assume that the user’s request to make a money transfer is represented by a MakeTransfer command. A command is a simple data object that encapsulates details about an action to be carried out:

    public abstract class Command
    {
        public DateTime Timestamp { get; set; }
        public T WithTimestamp<T>(DateTime timestamp) where T : Command
        {
            T result = (T)MemberwiseClone();
            result.Timestamp = timestamp;
            return result;
        }

    }

    public class MakeTransfer : Command
    {
        public Guid DebitedAccountId { get; set; }

        public string Beneficiary { get; set; }
        public string Iban { get; set; }
        public string Bic { get; set; }

        public DateTime Date { get; set; }
        public decimal Amount { get; set; }
        public string Reference { get; set; }
    }

We'll only look at the following validations:

- The Date field, representing the date on which the transfer should be executed, should not be in the past.
- The BIC code, a standard identifier for the beneficiary’s bank, should be valid.

We follow the single-responsibility principle and write one class for each particular validation.

public interface IValidator<T>
{
   bool IsValid(T t);
}

With our domain-specific abstraction in place, we can implement the validators:

 public sealed class BicFormatValidator : IValidator<MakeTransfer>
    {
        static readonly Regex regex = new("^[A-Z]{6}[A-Z1-9]{5}$");
        public bool IsValid(MakeTransfer t) => regex.IsMatch(t.Bic);
    }

    public sealed class BicExistsValidator_Skeleton : IValidator<MakeTransfer>
    {
        readonly IEnumerable<string> validCodes;
        public bool IsValid(MakeTransfer t) => validCodes.Contains(t.Bic);
    }

    public sealed class BicExistsValidator : IValidator<MakeTransfer>
    {
        readonly Func<IEnumerable<string>> getValidCodes;
        public BicExistsValidator(Func<IEnumerable<string>> getValidCodes) => this.getValidCodes = getValidCodes;

        public bool IsValid(MakeTransfer t) => getValidCodes().Contains(t.Bic);
    }

The logic in BicFormatValidator is pure, as there are no side effects and the result of IsValid is deterministic.

Let's check that BIC validation passes only when the code is in the list of valid codes:

// Assumed sample data; the original list of valid codes isn't shown in these notes.
static readonly string[] validCodes = { "ABCDEFGJ123" };

[Theory]
[InlineData("ABCDEFGJ123", true)]
[InlineData("XXXXXXXXXXX", false)]
public void WhenBicNotFound_ThenValidationFails(string bic, bool expected)
    // Injecting a function as a dependency.
    => Assert.Equal(expected, new BicExistsValidator(() => validCodes).IsValid(new MakeTransfer { Bic = bic }));

Let's check that a transfer dated in the future passes validation:

        [Fact]
        public void WhenTransferDateIsFuture_ThenValidatorPasses()
        {
            var sut = new DateNotPastValidator();
            var transfer = new MakeTransfer { Date = new DateTime(2030, 01, 12) };

            var actual = sut.IsValid(transfer);
            Assert.True(actual);
        }

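There is a side effect here: DateTime.UtcNow queries the system clock, which is outside the context of the program, so this is an I/O operation. As a result, the test is not repeatable: it will start failing once the hard-coded date is in the past.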
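The DateNotPastValidator exercised by this test is not shown in these notes. Here is a minimal sketch, assuming it makes the same comparison as the testable version in the next section but reads the clock directly. IValidator and a trimmed-down MakeTransfer are repeated so the block stands alone.

```csharp
using System;

public interface IValidator<T>
{
    bool IsValid(T t);
}

// Trimmed-down copy of the command: only the field this validator reads.
public class MakeTransfer
{
    public DateTime Date { get; set; }
}

// Impure: the result depends on the system clock as well as on the argument.
public class DateNotPastValidator : IValidator<MakeTransfer>
{
    public bool IsValid(MakeTransfer t) => DateTime.UtcNow.Date <= t.Date.Date;
}
```

The same MakeTransfer can be valid today and invalid tomorrow, so a test with a fixed date can't stay green forever.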

2.3.2 Bringing impure functions under test

Abstracting I/O operations behind an interface, and using a deterministic implementation in the tests, is the standard object-oriented technique for ensuring that unit tests behave consistently.

    // testable class depends on interface
    //Bringing impure functions under test
    public class DateNotPastValidator_Testable: IValidator<MakeTransfer>
    {
        private readonly IDateTimeService clock;
        public DateNotPastValidator_Testable(IDateTimeService clock) => this.clock = clock;
        public bool IsValid(MakeTransfer t) => clock.UtcNow.Date <= t.Date.Date;
    }


    // interface
    public interface IDateTimeService
    {
        DateTime UtcNow { get; }
    }

    // "real" implementation
    public class DefaultDateTimeService : IDateTimeService
    {
        public DateTime UtcNow => DateTime.UtcNow;
    }

When running unit tests, we inject a fake, pure implementation that does something predictable, such as always returning the same DateTime, enabling us to write tests that are repeatable.

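        // Arbitrary fixed date (an assumed value; the original field isn't shown in these notes).
        static readonly DateTime presentDate = new DateTime(2016, 12, 12);

        private class FakeDateTimeService : IDateTimeService
        {
            public DateTime UtcNow => presentDate;
        }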

[Theory]
[InlineData(+1, true)]
[InlineData(00, true)]
[InlineData(-1, false)]
public void WhenTransferDateIsPast_ThenValidationFails(int offset, bool expected)
{
var sut = new DateNotPastValidator_Testable(new FakeDateTimeService());
var transferDate = presentDate.AddDays(offset);
var transfer = new MakeTransfer { Date = transferDate };

var actual= sut.IsValid(transfer);
Assert.Equal(expected, actual);
}
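In the same spirit as BicExistsValidator above, the clock could also be injected as a function rather than through an interface. This is only a sketch of that alternative: the class name DateNotPastValidator_Func is hypothetical, and IValidator and a trimmed-down MakeTransfer are repeated so the block stands alone.

```csharp
using System;

public interface IValidator<T>
{
    bool IsValid(T t);
}

// Trimmed-down copy of the command: only the field this validator reads.
public class MakeTransfer
{
    public DateTime Date { get; set; }
}

public class DateNotPastValidator_Func : IValidator<MakeTransfer>
{
    private readonly Func<DateTime> clock;

    // Production code would pass () => DateTime.UtcNow; tests pass a fixed date.
    public DateNotPastValidator_Func(Func<DateTime> clock) => this.clock = clock;

    public bool IsValid(MakeTransfer t) => clock().Date <= t.Date.Date;
}
```

A test can then write new DateNotPastValidator_Func(() => presentDate) without declaring a fake class.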

Unit tests need to be isolated (no I/O) and repeatable.

References: Functional Programming in C#: How to write better C# code; xUnit Test